Abstract 

A core element of microeconomics and game theory is that consumers have valuation functions over bundles of 
goods and that these valuation functions drive their purchases. In particular, the value assigned to a bundle need not 
be the sum of values on the individual items but rather is often a more complex function of how the items relate. The 
literature considers a hierarchy of valuation classes that includes subadditive, XOS (i.e. fractionally subadditive), 
submodular, and OXS valuations. Typically it is assumed that these valuations are known to the center or that they 
come from a known distribution. Two recent lines of work, by Goemans et al. (SODA 2009) and by Balcan and 
Harvey (STOC 2011), have considered a more realistic setting in which valuations are learned from data, focusing 
specifically on submodular functions. 

In this paper we consider the approximate learnability of valuation functions at all levels in the hierarchy. We 
first study their learnability in the distributional learning (PAC-style) setting due to Balcan and Harvey (STOC 2011). 
We provide nearly tight lower and upper bounds of $\tilde{\Theta}(n^{1/2})$ on the approximation factor for learning XOS and 
subadditive valuations, both important classes that are strictly more general than submodular valuations. Interestingly, 
we show that the $\tilde{\Omega}(n^{1/2})$ lower bound can be circumvented for XOS functions of polynomial complexity; we 
provide an algorithm for learning the class of XOS valuations with a representation of polynomial size to within 
an $O(n^{\epsilon})$ approximation factor in running time $n^{O(1/\epsilon)}$ for any $\epsilon > 0$. We also establish learnability and hardness 
results for subclasses of the class of submodular valuations, i.e. gross substitutes valuations and interesting subclasses 
of OXS valuations. 

In proving our results for the distributional learning setting, we provide novel structural results for all these 
classes of valuations. We show the implications of these results for the learning everywhere with value queries 
model, considered by Goemans et al. (SODA 2009). 

Finally, we also introduce a more realistic variation of these models for economic settings, in which information 
on the value of a bundle $S$ of goods can only be inferred based on whether $S$ is purchased or not at a specific price. 
We provide lower and upper bounds for learning both in the distributional setting and with value queries. 
1 Introduction 



A central problem in commerce is understanding one's customers. Whether for assigning prices to goods, for deciding 
how to bundle products, or for estimating how much inventory to carry, it is critical for a company to understand its 
customers' preferences. In Economics and Game Theory, these preferences are typically modeled as valuations, or 
monotone set functions, over subsets of goods. It is usually assumed that consumers' valuations are known in advance 
to the company, or that they are drawn from a known distribution. In practice, however, these valuations must be 
learned. For example, given past data of customer purchases of different bundles, a retailer would like to estimate how 
much a (typical) customer would be willing to pay for new packages of goods that become available. Companies may 
also conduct surveys querying customers about their valuations.¹ 

Motivated by such scenarios, in this paper we investigate the learnability of classes of functions commonly used to 
model consumers' valuations. In particular, we focus on a wide class of valuations expressing "no complementarities": 
the value of the union of two disjoint bundles is no more than the sum of the values on each bundle — we henceforth use 
the standard optimization terminology subadditive valuations. We provide upper and lower bounds on the learnability 
of valuation classes in a popular hierarchy [18, 24, 28, 29], with submodular functions (the only class with similar 
extant results [5, 17]) halfway in the hierarchy: 

OXS $\subsetneq$ gross substitutes $\subsetneq$ submodular (the only related learnability results [5, 17]) $\subsetneq$ XOS $\subsetneq$ subadditive 

We analyze the learnability of these classes in the natural PMAC model [5] for approximate distributional learning. 
In this model, a learning algorithm is given a collection $\mathcal{S} = \{S_1, \dots, S_m\}$ of polynomially many labeled examples 
drawn i.i.d. from some fixed, but unknown, distribution $D$ over points (sets) in $2^{[n]}$. The points are labeled by a fixed, 
but unknown, target function $f^* : 2^{[n]} \to \mathbb{R}_+$. The goal is to output in polynomial time, with high probability, a 
hypothesis function $f$ that is a good multiplicative approximation for $f^*$ over most sets with respect to $D$. More 
formally, we want: 

$$\Pr_{S_1, \dots, S_m \sim D}\left[ \Pr_{S \sim D}\left[ f(S) \le f^*(S) \le \alpha f(S) \right] \ge 1 - \epsilon \right] \ge 1 - \delta$$

for an algorithm that uses $m = \mathrm{poly}(n, 1/\epsilon, 1/\delta)$ samples and that runs in $\mathrm{poly}(n, 1/\epsilon, 1/\delta)$ time. In contrast, the classical 
PAC model [32] requires predicting exactly (i.e. $\alpha = 1$) with high probability the values of $f^*$ over most sets with 
respect to $D$. Thus the PMAC model can be viewed as an approximation-algorithms extension of the traditional PAC 
model. 

Our main results in the PMAC-learning model are for superclasses of submodular valuations, namely subadditive 
valuations and XOS [13, 14, 24] (also known as fractionally subadditive [6, 13]) valuations. An XOS valuation 
represents a set of alternatives (e.g. travel destinations), where the valuation for subsets of goods (e.g. attractions) within 
each alternative is additive. The value of any set of goods, e.g. dining and skiing, is the highest value for these goods 
among all alternatives. That is, an XOS valuation is essentially a depth-two tree with a MAX root over SUM trees with 
goods as leaves. XOS valuations are intuitive and very expressive: they can represent any submodular valuation [24] 
and can approximate any valuation in the subadditive superclass to an $O(\log n)$ factor [6, 12]. We also consider 
subclasses of submodular functions in the hierarchy, namely gross substitutes [10, 18] and OXS [9, 11, 14, 30] functions. 
Gross substitutes valuations are characterized by the lack of pairwise synergies among items: for example, if the value 
of each of three items is the same, then no pair can have a strictly higher value than the other two pairs. Finally, the 
OXS class includes valuations representable as the SUM of MAX of item values. All these classes include linear 
valuations. 

We also analyze the model of approximate learning everywhere with value queries, due to Goemans et al. [17]. In 
this model, the learner can adaptively pick a sequence of sets $S_1, S_2, \dots$ and query the values $f^*(S_1), f^*(S_2), \dots$ 
Unlike the high confidence and high accuracy requirements of PMAC, this model requires approximately learning 
$f^*(\cdot)$ with certainty on all $2^n$ sets. We provide upper and lower bounds in this model for the same valuation classes. 

Finally, we introduce a more realistic variation of these models, in which the learner can obtain information only 
via prices. This variation is natural in settings where an agent with valuation /* is interested in purchases of goods. 

1 See e.g. http://bit.ly/ls774D for an example of an airline asking customers for a "reasonable" price for in-flight Internet. 
Our Results. We establish lower and upper bounds, the most general of them being almost tight, on the learnability 
of valuation classes in the aforementioned hierarchy. 

1. We show a nearly tight $O(\sqrt{n})$ upper bound and $\Omega(\sqrt{n}/\log n)$ lower bound on the learnability of XOS 
valuations in the PMAC model. The key element in our upper bound is to show that any XOS function can be 
approximated by the square root of a linear function to within a factor $O(\sqrt{n})$. Using this, we then reduce the 
problem of PMAC-learning XOS valuations to the standard problem of learning linear separators in the PAC 
model, which can be done via a number of efficient algorithms. Our $\Omega(\sqrt{n}/\log n)$ lower bound is information 
theoretic, applying to any procedure that uses a polynomial number of samples. We also show an $O(\sqrt{n} \log n)$ 
upper bound on the learnability of subadditive valuations in the PMAC model. 

2. We establish a target-dependent learnability result for XOS functions. Namely, we show the class of XOS 
functions representable with at most $R$ trees can be PMAC-learned to an $O(R^{\eta})$ factor in time $n^{O(1/\eta)}$ for any 
$\eta > 0$. In particular, for $R$ polynomial in $n$, we get learnability to an $O(n^{\eta})$ factor in time $n^{O(1/\eta)}$ for any $\eta > 0$. 
Technically, we prove this result via a novel structural result showing that an XOS function can be approximated 
well by the $L$-th root of a degree-$L$ polynomial over the natural feature representation of the set $S$. Conceptually, 
this result highlights the importance of the complexity of the target function for polynomial time learning. 

3. By exploiting novel structural results on approximability with simple functions, we provide much better upper 
bounds for other interesting subclasses of OXS and XOS. These include OXS and XOS functions with a small 
number of leaves per tree and OXS functions with a small number of trees. Some of these classes have been 
considered in the context of economic optimization problems [4, 6, 8], but we are the first to study their learnability. 
We also show that the previous $\tilde{\Omega}(n^{1/3})$ lower bound for PMAC-learning submodular functions [5] applies to 
the much simpler class of gross substitutes. 

4. The structural results we derive for analyzing learnability in the distributional learning setting also have 
implications for the model of exact learning with value queries [7, 17, 31]. In particular, they lead to new upper bounds 
for XOS and OXS as well as new lower bounds for XOS, gross substitutes, and OXS. 

5. Finally, we introduce a new model for learning with prices in which the learner receives less information on 
the values $f^*(S_1), f^*(S_2), \dots$: for each $l$, the learner can only quote a price $p_l$ and observe whether the agent 
buys $S_l$ or not, i.e. whether $p_l \le f^*(S_l)$ or not. This model is more realistic in economic settings where agents 
interact with a seller via prices only. Interestingly, many of our upper bounds, both for PMAC-learning and 
learning with value queries, are preserved in this model (all lower bounds automatically continue to hold). 

Our results are summarized in Table 1. Note that all our upper bounds are efficient and all the lower bounds are 
information theoretic. Our analysis has a number of interesting byproducts that should be of interest to the Combinatorial 
Optimization community. For example, it implies that recent lower bounds of [5, 16, 17, 31] on optimization 
under submodular cost functions also apply to the smaller classes of OXS and gross substitutes. 

Related Work. We study classes of valuations with fundamental properties (subadditivity and submodularity) or that 
are natural constructs used widely for optimization in economic settings [28]: XOS [6, 13, 14, 15, 24], i.e. MAX of 
SUMs, OXS [9, 11, 14, 30], i.e. SUM of MAXs, and gross substitutes, fundamental in allocation problems [2, 10, 18]. 

We focus on two widely studied learning paradigms: approximate learnability in a distributional setting [5, 21, 32, 
33] and approximate learning everywhere with value queries [7, 17, 22, 31]. In the first paradigm, we use a model 
introduced by [5] for the approximate learnability of submodular functions. We circumvent the main negative result 
in [5] for certain interesting classes and match the main positive result in [5] for the more general XOS and subadditive 
classes. With a few recent exceptions [17, 31], models for the value queries paradigm require exact learning and are 
necessarily limited to much less general function classes than the ones we study here: read-once and Toolbox DNF 
valuations [7], polynomial or linear-threshold valuations [23] or MAX or SUM (of bundles) valuations [22]. The latter 
two works also consider demand queries, where the learner can specify a set of prices and obtain a preferred bundle at 
these prices. In contrast, in our variation of learning with prices, the learner and the agent focus on one price only (for 
the current bundle) instead of as many as $2^n$ prices. 

2 Since the class of XOS functions representable with at most a polynomial number of trees has small complexity, learnability would be 
immediate if we did not care about computational efficiency. 
subadditive 


fc)(n ' ) [this paper] 


l/2\ r.i ■ n 

B(n ' J [this paper] 


C(n) [folklore] 


U{n) [this paper] 


xos 

XOS with < R trees 


(^(n 1 / 2 ) [this paper] 
O(R-) [this paper] 


^(n 1 / 2 ) [this paper] 
0(R E ) [this paper] 


^(n 1 / 2 ) [this paper] QT] 
O(-R) [this paper] 


^(n 1 / 2 ) [this paper] 
0(R) [this paper] 


submodular 


(fi(n 1/3 )>O(» 1/2 ))0 


(r>(™ 1/3 ),o(n 1/2 )) 

[this paper] 






gross substitutes 


ri(?i 1 / 3 ) [this paper] 


^(n 1 / 3 ) [this paper] 


^(n 1 / 2 ) [this paper] 


^(n 1 / 2 ) [this paper] 


OXS with < R trees 
or < R leaves per tree 


O(-R) [this paper] 


O(-R) [this paper] 


0(R) [this paper] 


0(R) [this paper] 



Table 1: Lower and upper bounds for learnability factors achievable in different models for standard classes of 
valuations (presented in decreasing order of generality). All the upper bounds refer to polynomial time algorithms. Our 
construction for the $\Omega(n^{1/2}/\log n) = \tilde{\Omega}(n^{1/2})$ lower bound on learning XOS valuations with value queries is simpler 
than the construction for the same asymptotic lower bound of Goemans et al. [17]. 



Paper structure. After defining valuation classes and our models in Section 2, we study the distributional learnability 
of valuation classes in decreasing order of generality. First, Section 3 presents our results on XOS and subadditive 
valuations, including our most general bounds, which are almost tight. Section 4 presents a hardness result for gross 
substitutes and positive results on several interesting subclasses of OXS. Section 5 provides positive results for most 
of these classes in learning with value queries. Finally, in Section 6 we show that many of our results extend to a 
natural framework in economic applications, even though the learner receives less information in this framework. 

2 Preliminaries 

We consider a universe $[n] = \{1, \dots, n\}$ of items and valuations, i.e. monotone non-negative set functions $f : 2^{[n]} \to \mathbb{R}_+$: 
$f(S \cup \{i\}) \ge f(S) \ge 0$, $\forall S \subseteq [n]$, $\forall i \notin S$. For a set $S \subseteq [n]$ we denote by $\chi(S) \in \{0, 1\}^n$ its indicator vector; 
so $(\chi(S))_i = 1$ if $i \in S$ and $(\chi(S))_i = 0$ if $i \notin S$. We often use this natural isomorphism between $\{0, 1\}^n$ and $2^{[n]}$. 
Classes of Valuation Functions. It is often the case that valuations are quite structured in terms of representation or 
constraints on the values of different sets. We now define and give intuition for most valuation classes we focus on. 

The following standard properties have natural interpretations in economic settings. A subadditive valuation mod- 
els the lack of synergies among sets: a set's value is at most the sum of the values of its parts. A submodular valuation 
models decreasing marginal returns: an item j's marginal value cannot go up if one expands the base set S by item i. 

Definition 1. A valuation $f : 2^{[n]} \to \mathbb{R}_+$ is called subadditive if and only if $f(S \cup S') \le f(S) + f(S')$, $\forall S, S' \subseteq [n]$. 
A valuation $f$ is called submodular if and only if $f(S \cup \{i, j\}) - f(S \cup \{i\}) \le f(S \cup \{j\}) - f(S)$, $\forall S \subseteq [n]$, $\forall i, j \notin S$. 
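Definition 1 can be checked by brute force on small universes. The sketch below is only illustrative: the helper names (`all_subsets`, `is_subadditive`, `is_submodular`) and the two example valuations are assumptions introduced here, items are indexed from 0 rather than 1, and the checks take time exponential in n.

```python
from itertools import combinations

def all_subsets(n):
    """Yield every subset of the items {0, ..., n-1} as a frozenset."""
    for size in range(n + 1):
        for combo in combinations(range(n), size):
            yield frozenset(combo)

def is_subadditive(f, n, tol=1e-12):
    """Check f(S | T) <= f(S) + f(T) for every pair of subsets."""
    sets = list(all_subsets(n))
    return all(f(s | t) <= f(s) + f(t) + tol for s in sets for t in sets)

def is_submodular(f, n, tol=1e-12):
    """Check decreasing marginal returns for every S and distinct i, j outside S."""
    for s in all_subsets(n):
        outside = [i for i in range(n) if i not in s]
        for i in outside:
            for j in outside:
                if i == j:
                    continue
                gain_after_i = f(s | {i, j}) - f(s | {i})
                gain_alone = f(s | {j}) - f(s)
                if gain_after_i > gain_alone + tol:
                    return False
    return True

def capped_count(s):
    """Hypothetical example: value is the number of items, capped at 2."""
    return min(len(s), 2)

def two_destinations(s):
    """Hypothetical example: maximum of two additive valuations (an XOS function)."""
    return max(len(s & {0, 1}), len(s & {2}))

print(is_subadditive(two_destinations, 3), is_submodular(two_destinations, 3))
```

The second example is subadditive but fails the submodularity check: with S = {2}, adding item 1 after item 0 gains 1, while adding item 1 alone gains 0.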

XOS is an important class of subadditive, but not necessarily submodular, valuations studied in combinatorial 
auctions [13, 14, 15, 24]. A valuation is XOS if and only if it can be represented as a depth-two tree with a MAX 
root and SUM inner nodes. Each such SUM node has as leaves a subset of items with associated positive weights. 
For example, a traveler may choose the destination of maximum value among several different locations, where each 
location has a number of amenities and the valuation for a location is linear in the set of amenities. 

Definition 2. A valuation $f$ is XOS if and only if it can be represented as the maximum of $k$ linear valuations, for 
some $k \ge 1$. That is, $f(S) = \max_{j = 1 \dots k} w_j^T \chi(S)$ where $w_{ji} \ge 0$, $\forall j = 1 \dots k$, $\forall i = 1 \dots n$. 

We say that item i appears as a leaf in a SUM tree j if i has a positive value in tree j. 
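As a small illustration of Definition 2, the sketch below evaluates an XOS valuation given as a list of weight vectors, one per SUM tree. The function names and the example weights (two destinations over three amenities) are hypothetical, and items are indexed from 0.

```python
def indicator(subset, n):
    """Indicator vector chi(S) of a subset of {0, ..., n-1}."""
    return [1 if i in subset else 0 for i in range(n)]

def xos_value(weights, subset):
    """Value of an XOS valuation: the maximum over SUM trees of w_j . chi(S)."""
    n = len(weights[0])
    chi = indicator(subset, n)
    return max(sum(w_i * c for w_i, c in zip(w, chi)) for w in weights)

def leaves(weights, j):
    """Items that appear as leaves of SUM tree j, i.e. have positive weight in it."""
    return {i for i, w in enumerate(weights[j]) if w > 0}

# Hypothetical example: items 0 = dining, 1 = skiing, 2 = beach.
destinations = [
    [3, 5, 0],  # first destination: dining and skiing
    [4, 0, 6],  # second destination: dining and beach
]
print(xos_value(destinations, {0, 1}))  # best of 3 + 5 and 4 + 0
print(xos_value(destinations, {0, 2}))  # best of 3 + 0 and 4 + 6
```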

As already mentioned, any submodular valuation can be expressed as XOS.³ When reversing the roles of operators 
MAX and SUM we obtain a strict subclass of submodular valuations, called OXS,⁴ that is also relevant to auctions 
[11, 14, 24, 30]. To define OXS we also define a unit-demand valuation, in which the value of any set $S$ is the highest 
weight of any item in $S$. A unit-demand valuation is essentially a tree, with a MAX root and one leaf for each item with 
non-zero associated weight. In an OXS valuation, a set's value is given by the best way to split the set among several 
unit-demand valuations. An OXS valuation $f$ has a natural representation as a depth-two tree, with a SUM node at the 
root (on level 0), and subtrees⁵ corresponding to the unit-demand valuations $f_1, \dots, f_k$. The value $f(S)$ of any set $S$ 
corresponds to the best way of partitioning $S$ into $(S_1, \dots, S_k)$ and adding up the per-tree values $f_1(S_1), \dots, f_k(S_k)$. 

3 As shown in [24], any submodular $f$ can be represented as the MAX of $n!$ SUM trees, each with $n$ leaves: for every permutation $\pi$ 
of $[n]$, we build a SUM tree $T_\pi$ with one leaf for each item $j \in [n]$, where item $\pi(j)$ has weight its marginal value 
$f(\{\pi(1), \dots, \pi(j)\}) - f(\{\pi(1), \dots, \pi(j-1)\})$. 

4 XOS and OXS stand for XOR-of-OR-of-Singletons and OR-of-XOR-of-Singletons, where MAX is denoted by XOR and SUM by OR [27, 29]. 

Definition 3. A unit-demand valuation $f$ is given by weights $\{w_1, \dots, w_n\} \subseteq \mathbb{R}_+$ such that $f(S) = \max_{i \in S} w_i$, $\forall S \subseteq [n]$. 
An OXS valuation $f$ is given by the convolution of $k \ge 1$ unit-demand valuations $f_1, \dots, f_k$: that is, 
$f(S) = \max\{f_1(S_1) + \dots + f_k(S_k) : (S_1, \dots, S_k) \text{ is a partition of } S\}$, $\forall S \subseteq [n]$. 
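Definition 3 can be evaluated directly on small instances by trying every way of assigning the items of S to the k unit-demand valuations. The sketch below does exactly that; the function names and example weights are hypothetical, items are indexed from 0, and the enumeration is exponential in |S|. It is meant only to make the definition concrete.

```python
from itertools import product

def unit_demand(weights, subset):
    """Unit-demand value: the highest weight of any item in the subset (0 if empty)."""
    return max((weights[i] for i in subset), default=0)

def oxs_value(unit_weights, subset):
    """OXS value: best partition of the subset among the k unit-demand valuations."""
    items = sorted(subset)
    k = len(unit_weights)
    best = 0
    # Each assignment maps every item to one of the k unit-demand valuations.
    for assignment in product(range(k), repeat=len(items)):
        parts = [set() for _ in range(k)]
        for item, j in zip(items, assignment):
            parts[j].add(item)
        total = sum(unit_demand(unit_weights[j], parts[j]) for j in range(k))
        best = max(best, total)
    return best

# Hypothetical example with k = 2 unit-demand valuations over 3 items.
unit_weights = [
    [5, 4, 0],
    [3, 4, 1],
]
print(oxs_value(unit_weights, {0, 1}))
```

For S = {0, 1} the best split gives item 0 to the first valuation and item 1 to the second, for a total of 5 + 4.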

Finally, we consider gross substitutes (GS) valuations, of great interest in allocation problems [10, 18, 28]. Informally, 
an agent with a gross substitutes valuation would not buy fewer items of one type (e.g. skis) if items of another 
type (e.g. snowboards) became more expensive. That is, items can be substituted one for another in a certain sense, 
which is not the case for, e.g. skis and ski boots. See Section 4 for a formal definition and more detailed discussion. 

As already mentioned, the classes of valuations we reviewed thus far form a strict hierarchy. (See [24] for examples 
separating these classes of valuations.) 

Lemma 1. [24] OXS $\subsetneq$ gross substitutes $\subsetneq$ submodular $\subsetneq$ XOS $\subsetneq$ subadditive. 

Only the class of submodular valuations has been studied from an approximate learning perspective [5, 17]. We 
study the approximate learnability of all other classes in this hierarchy, in a few natural models that we introduce now. 
Distributional Learning: PMAC. We primarily study learning in the PMAC model of [5]. We assume that the input 
for a learning algorithm is a set $\mathcal{S}$ of polynomially many labeled examples drawn i.i.d. from some fixed, but unknown, 
distribution $D$ over points in $2^{[n]}$. The points are labeled by a fixed, but unknown, target function $f^* : 2^{[n]} \to \mathbb{R}_+$. 
The goal is to output a hypothesis function $f$ such that, with high probability over the choice of examples, the set of 
points for which $f$ is a good approximation for $f^*$ has large measure with respect to $D$. Formally: 

Definition 4. We say that a family $\mathcal{F}$ of valuations is PMAC-learnable with approximation factor $\alpha$ if there exists 
an algorithm $\mathcal{A}$ such that for any distribution $D$ over $2^{[n]}$, for any target function $f^* \in \mathcal{F}$, and for any sufficiently 
small $\epsilon > 0$, $\delta > 0$, $\mathcal{A}$ takes as input samples $\{(S_i, f^*(S_i))\}_{1 \le i \le m}$ where each $S_i$ is drawn independently from $D$ 
and outputs a valuation $f : 2^{[n]} \to \mathbb{R}$ such that $\Pr_{S_1, \dots, S_m \sim D}\left[ \Pr_{S \sim D}\left[ f(S) \le f^*(S) \le \alpha f(S) \right] \ge 1 - \epsilon \right] \ge 1 - \delta$. 
$\mathcal{A}$ must use $m = \mathrm{poly}(n, 1/\epsilon, 1/\delta)$ samples and must have running time $\mathrm{poly}(n, 1/\epsilon, 1/\delta)$. 
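The inner probability in Definition 4 can be estimated empirically on fresh sets drawn from D. The following sketch is an illustration under stated assumptions: the function names are introduced here, and the product distribution in `draw_sets` merely stands in for an unknown D.

```python
import random

def approx_ok(f, f_star, s, alpha):
    """True when f(S) <= f*(S) <= alpha * f(S) holds on the set S."""
    return f(s) <= f_star(s) <= alpha * f(s)

def empirical_success(f, f_star, sample_sets, alpha):
    """Fraction of the sampled sets on which f is an alpha-approximation of f*."""
    good = sum(approx_ok(f, f_star, s, alpha) for s in sample_sets)
    return good / len(sample_sets)

def draw_sets(n, m, p=0.5, seed=0):
    """Hypothetical stand-in for D: each item is included independently with probability p."""
    rng = random.Random(seed)
    return [frozenset(i for i in range(n) if rng.random() < p) for _ in range(m)]

# Hypothetical target and hypothesis: f* counts items, f undershoots by a factor of 2.
f_star = lambda s: len(s)
f_half = lambda s: len(s) / 2

test_sets = draw_sets(n=6, m=200)
print(empirical_success(f_half, f_star, test_sets, alpha=2))
```

A hypothesis meets the PMAC requirement when this fraction is at least 1 - ε with probability at least 1 - δ over the training sample; the estimate above only approximates the inner probability.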

PMAC stands for Probably Mostly Approximately Correct (the PAC model [32] is a special case of PMAC with $\alpha = 1$). 
Learning with Value Queries. We also consider the model of learnability everywhere (in the same approximate sense) 
with value queries. In this model, the learning algorithm is allowed to query the value of the unknown target function 
$f^*$ on a polynomial number of sets $S_1, S_2, \dots$ that may be chosen in an adaptive fashion. The algorithm must then 
output in polynomial time a function $f$ that approximates $f^*$ everywhere, namely $f(S) \le f^*(S) \le \alpha f(S)$, $\forall S \subseteq [n]$. 
A formal definition of this model and the results are presented in Section 5. 

Learning with Prices. This framework aims to model economic interactions more realistically and considers a setting 
where an agent with the target valuation $f^*$ is interested in purchasing bundles of goods. In this framework, the learner 
does not obtain the value of $f^*$ on each input set $S_1, S_2, \dots$ Instead, for each input set $S_l$ the learner quotes a price 
$p_l$ on $S_l$ and observes whether the agent purchases $S_l$ or not, i.e. whether $p_l \le f^*(S_l)$ or not. The goal remains to 
approximate the function $f^*$ well, i.e. within an $\alpha$ multiplicative factor: on most sets from $D$ with high confidence 
for PMAC-learning and on all sets with certainty for learning everywhere with value queries. This framework and the 
associated results are presented in Section 6. 
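To make the price feedback concrete, the sketch below simulates an agent who buys a bundle exactly when the quoted price is at most its value, and then recovers an estimate of that value by binary search over prices. This is only an illustration: it assumes the learner may quote several prices on the same bundle and knows an upper bound on the value, neither of which is stated above, and the function names are hypothetical.

```python
def buys(f_star, subset, price):
    """Simulated agent: purchases the bundle if and only if price <= f*(S)."""
    return price <= f_star(subset)

def estimate_value(buy_oracle, subset, upper, rounds=30):
    """Binary search on the price, assuming 0 <= f*(S) <= upper.

    Returns a price `lo` that the agent accepted (or 0), so lo <= f*(S),
    and f*(S) - lo is at most upper / 2**rounds.
    """
    lo, hi = 0.0, float(upper)
    for _ in range(rounds):
        mid = (lo + hi) / 2
        if buy_oracle(subset, mid):
            lo = mid   # the agent bought, so the value is at least mid
        else:
            hi = mid   # the agent declined, so the value is below mid
    return lo

# Hypothetical target: every item is worth 1.5.
f_star = lambda s: 1.5 * len(s)
oracle = lambda s, price: buys(f_star, s, price)

estimate = estimate_value(oracle, {0, 1}, upper=10.0)
print(estimate)
```

The guarantee here is additive; turning it into a multiplicative one requires some handle on how small a non-zero value can be, which this sketch does not address.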

3 PMAC-learnability of XOS valuations and subadditive valuations 

In this section we give nearly tight lower and upper bounds of $\tilde{\Theta}(\sqrt{n})$ for the PMAC-learnability of XOS and 
subadditive valuations. In contrast, there is a $\tilde{O}(n^{1/6})$ gap between the existing bounds for submodular valuations [5]. 
Furthermore, we reveal the importance of considering the complexity of the target function (in a natural representation) 
for polynomial-time PMAC learning. We show that XOS valuations representable with a polynomial number of 
SUM trees are PMAC-learnable to an $n^{\eta}$ factor in time $n^{O(1/\eta)}$, for any $\eta > 0$. Finally, we show that XOS valuations 
representable with an arbitrary number of SUM trees, each with at most $R$ leaves, are PMAC-learnable to an $R$ factor. 

5 Another OXS encoding uses a weighted bipartite graph $G$, where edge $(i, j)$ has the weight of item $i$ in $f_j$: $f(S)$ is the weight of a maximum 
matching of $S$ to the $k$ nodes for the unit-demand $f_j$'s. Also, OXS valuations with weights $\{0,1\}$ are exactly rank functions of transversal matroids. 

3.1 Nearly tight lower and upper bounds for learning XOS and subadditive functions 

We establish our $\tilde{\Theta}(\sqrt{n})$ bounds by showing an $\Omega(\sqrt{n}/\log n)$ lower bound for the class of XOS valuations (hence 
valid for subadditive valuations) and upper bounds of $O(\sqrt{n})$ and $O(\sqrt{n} \log n)$ for the classes of XOS and subadditive 
valuations respectively. We note that our lower bound construction is much simpler and gives a better bound than the 
$\Omega(n^{1/3}/\log n)$ construction of [5]. However, the latter construction is for matroid rank functions, a significantly 
smaller class. For our upper bounds we provide structural results showing that XOS and subadditive functions can be 
approximated by the square root of a linear function to an $O(\sqrt{n})$ and $O(\sqrt{n} \ln n)$ factor respectively. We can then 
PMAC-learn these classes via a reduction to the classical problem of PAC-learning a linear separator. 

Theorem 1. The classes of XOS and subadditive functions are PMAC-learnable to a $\tilde{\Theta}(\sqrt{n})$ approximation factor. 
Proof Sketch: Lower bound: We start with an information theoretic lower bound showing that the class of XOS 
valuations cannot be learned with an approximation factor of $o(\sqrt{n}/\log n)$ from a polynomial number of samples. 

Let $k = n^{\frac{1}{3} \log\log n}$. For large enough $n$ we can show that there exist sets $A_1, A_2, \dots, A_k \subseteq [n]$ such that 

(i) $\sqrt{n}/2 \le |A_i| \le 2\sqrt{n}$ for any $1 \le i \le k$, i.e. all sets have large size $\Theta(\sqrt{n})$ and 

(ii) $|A_i \cap A_j| \le \log n$ for any $1 \le i < j \le k$, i.e. all pairwise intersections have small size $O(\log n)$. 

We achieve this via a simple probabilistic argument where we construct each $A_i$ by picking each element in $[n]$ 
independently with probability $\frac{1}{\sqrt{n}}$. Let random variables $Y_i = |A_i|$ and $X_{ij} = |A_i \cap A_j|$. Obviously, $E[Y_i] = \sqrt{n}$ and $E[X_{ij}] = 1$. By 
Chernoff bounds, 

$$\Pr\left[ \sqrt{n}/2 < Y_i < 2\sqrt{n} \right] > 1 - 2e^{-\sqrt{n}/8} \quad \text{and} \quad \Pr\left[ X_{ij} > \ln n \right] < \left( \frac{e}{\ln n} \right)^{\ln n} = n^{-(\ln\ln n - 1)}, \quad \forall 1 \le i < j \le k.$$

By union bound the probability that (i) and (ii) hold is at least $1 - 2k e^{-\sqrt{n}/8} - k^2 n^{-(\ln\ln n - 1)} > 0$. 

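Given the existence of the family $\mathcal{A} = \{A_1, \dots, A_k\}$ of sets with properties (i) and (ii) above, we construct a 
hard family of XOS functions as follows. For any subfamily $\mathcal{B} \subseteq \mathcal{A}$, we construct an XOS function $f_{\mathcal{B}}$ with large 
values for sets $A_i \in \mathcal{B}$ and small values for sets $A_i \notin \mathcal{B}$. Let $h_{A_i}(S) = |S \cap A_i|$ for any $S \subseteq [n]$. For any subfamily 
$\mathcal{B} \subseteq \mathcal{A}$, define the XOS function $f_{\mathcal{B}}$ by $f_{\mathcal{B}}(S) = \max_{A_i \in \mathcal{B}} h_{A_i}(S)$. We claim that $f_{\mathcal{B}}(A_i) = \Omega(\sqrt{n})$ if $A_i \in \mathcal{B}$ but 
$f_{\mathcal{B}}(A_i) = O(\log n)$ if $A_i \notin \mathcal{B}$. Indeed, for any $A_i \in \mathcal{B}$, we have $h_{A_i}(A_i) = |A_i| \ge \sqrt{n}/2$, hence $f_{\mathcal{B}}(A_i) = \Omega(\sqrt{n})$; 
for any $A_j \notin \mathcal{B}$, by our construction of $\mathcal{A}$, we have $h_{A_i}(A_j) = |A_i \cap A_j| \le \log n$, implying $f_{\mathcal{B}}(A_j) = O(\log n)$. For 
an unknown $\mathcal{B}$, the problem of learning $f_{\mathcal{B}}$ within a factor of $o(\sqrt{n}/\log n)$ under a uniform distribution on $\mathcal{A}$ amounts 
to distinguishing $\mathcal{B}$ from $\mathcal{A}$. This is not possible from a polynomial number of samples since $|\mathcal{A}| = n^{\frac{1}{3} \log\log n}$. In 
particular, if $\mathcal{B} \subseteq \mathcal{A}$ is chosen at random, then any algorithm from a polynomial-sized sample will have error $\Omega(\sqrt{n}/\log n)$ 
on a region of probability mass greater than $\frac{1}{2} - \frac{1}{\mathrm{poly}(n)}$. 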
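The construction in the lower bound can be explored numerically. The sketch below is an illustration only: it samples a random family as described, checks properties (i) and (ii) using the natural logarithm, and evaluates $f_{\mathcal{B}}$ on a small hand-made family. All function names, the seed, and the hand-made sets are assumptions introduced here, and items are indexed from 0.

```python
import math
import random

def random_family(n, k, seed=0):
    """Sample k sets, each containing every item independently with probability 1/sqrt(n)."""
    rng = random.Random(seed)
    p = 1 / math.sqrt(n)
    return [frozenset(i for i in range(n) if rng.random() < p) for _ in range(k)]

def satisfies_properties(family, n):
    """Check (i) sqrt(n)/2 <= |A_i| <= 2 sqrt(n) and (ii) |A_i & A_j| <= ln n for i < j."""
    root = math.sqrt(n)
    sizes_ok = all(root / 2 <= len(a) <= 2 * root for a in family)
    intersections_ok = all(
        len(a & b) <= math.log(n)
        for idx, a in enumerate(family)
        for b in family[idx + 1:]
    )
    return sizes_ok and intersections_ok

def f_B(subfamily, s):
    """The XOS function f_B(S) = max over A in B of |S & A| (0 for an empty subfamily)."""
    return max((len(s & a) for a in subfamily), default=0)

# Hand-made family for n = 16: sizes equal sqrt(n) = 4, pairwise intersections at most 1.
A0 = frozenset(range(0, 4))
A1 = frozenset(range(4, 8))
A2 = frozenset({0, 4, 8, 9})
family = [A0, A1, A2]
B = [A0, A2]

print(satisfies_properties(family, 16))
print(f_B(B, A0), f_B(B, A1))  # large on a member of B, small on a non-member
```

A learner that sees only polynomially many samples from a very large family cannot tell which unseen sets belong to the subfamily, which is the gap the argument exploits.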

Upper bounds: We show that the class of XOS valuations can be PMAC-learned to an $O(\sqrt{n})$ factor and that the class 
of subadditive valuations can be PMAC-learned to an $O(\sqrt{n} \log n)$ factor, by using $O\left(\frac{n}{\epsilon} \log \frac{n}{\delta\epsilon}\right)$ training examples and 
running time $\mathrm{poly}(n, 1/\epsilon, 1/\delta)$. To prove these bounds we start by providing a structural result (Claim 1 below) showing 
that XOS valuations can be approximated to a $\sqrt{n}$ factor by the square root of a linear function. 

Claim 1. Let $f : 2^{[n]} \to \mathbb{R}_+$ be a non-negative XOS function with $f(\emptyset) = 0$. Then there exists a function $\hat{f}$ of the 
form $\hat{f}(S) = \sqrt{w^T \chi(S)}$ where $w \in \mathbb{R}^n$ such that $\hat{f}(S) \le f(S) \le \sqrt{n} \hat{f}(S)$ for all $S \subseteq [n]$. 

Proof. XOS valuations are known [15] to be equivalent to fractionally subadditive valuations. A function $f : 2^{[n]} \to \mathbb{R}$ 
is called fractionally subadditive if $f(T) \le \sum_S \lambda_S f(S)$ whenever $\lambda_S \ge 0$ and $\sum_{S : s \in S} \lambda_S \ge 1$ for any $s \in T$. 

We can show the following property of XOS valuations: for any XOS $f$ we have $f(T) = \max\{\sum_{i \in T} x_i \mid x \in P(f)\}$, 
where $P(f)$ is the associated polyhedron $\{x \in \mathbb{R}^n_+ : \sum_{i \in S} x_i \le f(S), \forall S \subseteq [n]\}$. Informally, this result states that 
one recovers $f(T)$ when optimizing in the direction given by $T$ over the polyhedron $P(f)$ associated with $f$. The 
proof of this result involves a pair of dual linear programs, one corresponding to the maximization and another one 
that is tailored for fractional subadditivity, with an optimal objective value of $f(T)$. Formally, for any $T \subseteq [n]$ we 
have $\sum_{i \in T} x_i \le f(T)$ for any $x \in P(f)$. Therefore $f(T) \ge \max\{\sum_{i \in T} x_i \mid x \in P(f)\}$. Now we prove that in fact 

$$f(T) \le \max\left\{ \sum_{i \in T} x_i \,\middle|\, x \in P(f) \right\}.$$

Consider the linear program (LP1) for the quantity $\max\{x(T) \mid x \in P(f)\}$ and its dual (LP2): we assign a dual 
variable $y_S$ for each constraint in (LP1), and we have a constraint corresponding to each primal variable indicating that 
the total amount of dual corresponding to a primal variable should be at least its coefficient in the primal objective. 

$$\text{(LP1)} \quad \max \sum_{i \in T} x_i \quad \text{s.t.} \quad \sum_{i \in S} x_i \le f(S) \ \ \forall S \subseteq [n], \qquad x_i \ge 0 \ \ \forall i \in [n].$$

$$\text{(LP2)} \quad \min \sum_{S \subseteq [n]} y_S f(S) \quad \text{s.t.} \quad \sum_{S : i \in S} y_S \ge 1 \ \ \forall i \in T, \qquad y_S \ge 0 \ \ \forall S \subseteq [n].$$

The classical theory of linear optimization gives that the optimal primal value equals the optimal dual value. 
Let $y^*$ be an optimal solution of (LP2). Therefore 

$$\sum_{S \subseteq [n]} y^*_S f(S) = \max\{x(T) \mid x \in P(f)\}.$$

Since $f$ is fractionally subadditive and $\sum_{S : i \in S} y^*_S \ge 1$, $\forall i \in T$, we have $f(T) \le \sum_{S \subseteq [n]} y^*_S f(S)$, hence 
$f(T) \le \max\{x(T) \mid x \in P(f)\}$. This completes the proof of the fact that $f(T) = \max\{\sum_{i \in T} x_i \mid x \in P(f)\}$. 

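Given this result, we proceed as follows (a very similar approach is used by [17] for submodular functions). Define 
$P = \{x \in \mathbb{R}^n : (|x_1|, \dots, |x_n|) \in P(f)\}$. Since $P$ is bounded and centrally symmetric (i.e. $x \in P \Leftrightarrow -x \in P$), there 
exists [20] an ellipsoid $\mathcal{E}$ containing $P$ such that $\frac{1}{\sqrt{n}} \mathcal{E}$ is contained in $P$. Hence for $\hat{f}(T) = \max\{\sum_{i \in T} x_i : x \in \frac{1}{\sqrt{n}} \mathcal{E}\}$ 
we have $\hat{f}(T) \le f(T) \le \sqrt{n} \hat{f}(T)$, $\forall T \subseteq [n]$. At last, basic calculus implies $\hat{f}(T) = \sqrt{w^T \chi(T)}$ for some $w \in \mathbb{R}^n$. □ 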

For PMAC-learning XOS valuations with an approximation factor of $\sqrt{n + \epsilon}$, we apply Algorithm 1 with 
parameters $R = n$, $\epsilon$, and $p = 2$. The proof of correctness of Algorithm 1 follows by using the structural result in 
Claim 1 and a technique of [5] that we sketch briefly here. Full details of this proof appear in Appendix A. 

Assume first that $f^*(S) > 0$ for all $S \ne \emptyset$. The key idea is that Claim 1's structural result implies that the following 
examples in $\mathbb{R}^{n+1}$ are linearly separable since $n w^T \chi(S) - (f^*(S))^2 \ge 0$ and $n w^T \chi(S) - (n + \epsilon)(f^*(S))^2 < 0$. 

Examples labeled +1: $\mathrm{ex}^+_S := (\chi(S), (f^*(S))^2)$ $\forall S \subseteq [n]$ 

Examples labeled −1: $\mathrm{ex}^-_S := (\chi(S), (n + \epsilon) \cdot (f^*(S))^2)$ $\forall S \subseteq [n]$ 

This suggests trying to reduce our learning problem to the standard problem of learning a linear separator for these 
examples in the standard PAC model [21, 33]. However, in order to apply standard techniques to learn such a linear 
separator, we must ensure that our training examples are i.i.d. To achieve this, we create an i.i.d. distribution $D'$ in 
$\mathbb{R}^{n+1}$ that is related to the original distribution $D$ as follows. First, we draw a sample $S \subseteq [n]$ from the distribution $D$ 
and then flip a fair coin for each such sample. The sample from $D'$ is $\mathrm{ex}^+_S$, i.e. labeled +1, if the coin is heads and $\mathrm{ex}^-_S$, i.e. labeled −1, if the 
coin is tails. As mentioned above, these labeled examples are linearly separable in $\mathbb{R}^{n+1}$. Conversely, suppose we can 
find a linear separator that classifies most of the examples coming from $D'$ correctly. Assume that this linear separator 
in $\mathbb{R}^{n+1}$ is defined by the function $u^T x = 0$, where $u = (w, -z)$, $w \in \mathbb{R}^n$ and $z > 0$. The key observation is that the 
function $f(S) = \frac{1}{(n + \epsilon) z} w^T \chi(S)$ approximates $(f^*(\cdot))^2$ to within a factor $n + \epsilon$ on most of the points coming from $D$. 

If $f^*$ is zero on some non-empty sets, then we can learn its set $Z = \{S : f^*(S) = 0\}$ of zeros quickly since $Z$ is 
closed under union and taking subsets for any subadditive $f^*$. In particular, suppose that there is at least an $\epsilon$ chance that 
a new example is a zero of $f^*$, but does not lie in the null subcube over the sample. Then such an example should be 
seen in the next sequence of $\log(1/\delta)/\epsilon$ examples, with probability at least $1 - \delta$. This new example increases the 
dimension of the null subcube by at least one, and therefore this can happen at most $n$ times. 
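Because the zero set is closed under unions and subsets, the zeros seen in a sample determine a "null subcube": every subset of the union of the zero-valued sample sets must also have value zero. The sketch below illustrates this bookkeeping; the function names and example data are hypothetical.

```python
def null_items(examples):
    """Union of all sampled sets whose observed value is zero."""
    u0 = set()
    for subset, value in examples:
        if value == 0:
            u0 |= set(subset)
    return u0

def predicted_zero(u0, subset):
    """A set lies in the null subcube exactly when it is contained in the union u0."""
    return set(subset) <= u0

# Hypothetical labeled sample: two zero-valued sets and one positive one.
examples = [({0, 1}, 0), ({2}, 0), ({3}, 2.0)]
u0 = null_items(examples)
print(sorted(u0))
print(predicted_zero(u0, {1, 2}), predicted_zero(u0, {2, 3}))
```

Each new zero-valued example that falls outside the current subcube adds at least one item to the union, which is why this can happen at most n times.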
To establish learnability for the class of subadditive valuations, we note that any subadditive valuation can be 
approximated by an XOS valuation to a $\ln n$ factor [6, 12] and so, by Claim 1, any subadditive valuation is 
approximated to a $\sqrt{n} \ln n$ factor by the square root of a linear function. This then implies that we can use Algorithm 1 with parameters 
$R = n \ln^2 n$, $\epsilon$, and $p = 2$. Correctness then follows by a reasoning similar to the one for XOS functions. □ 

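Algorithm 1 Algorithm for PMAC-learning via a reduction to a binary linear separator problem.
Input: Parameters: $R$, $\epsilon$ and $p$. Training examples $\mathcal{S} = \{(S_1, f^*(S_1)), \ldots, (S_m, f^*(S_m))\}$.

• Let $\mathcal{S}_{\neq 0} = \{(A_i, f^*(A_i)) \in \mathcal{S} : f^*(A_i) \neq 0\} \subseteq \mathcal{S}$ be the examples with non-zero values, $\mathcal{S}_0 = \mathcal{S} \setminus \mathcal{S}_{\neq 0}$ and $U_0 = \bigcup_{i \le m;\ f^*(S_i) = 0} S_i$.

• For each $i$ in $\{1, \ldots, |\mathcal{S}_{\neq 0}|\}$ let $y_i$ be the outcome of independently flipping a fair $\{+1, -1\}$-valued coin. Let $x_i \in \mathbb{R}^{n+1}$ be the point defined by

$$x_i = \begin{cases} (\chi(A_i),\ (f^*(A_i))^p) & \text{if } y_i = +1 \\ (\chi(A_i),\ (R+\epsilon)\cdot(f^*(A_i))^p) & \text{if } y_i = -1. \end{cases}$$

• Find a linear separator $u = (w, -z) \in \mathbb{R}^{n+1}$, where $w \in \mathbb{R}^n$ and $z > 0$, such that $(x, \mathrm{sgn}(u^\top x))$ is consistent with the labeled examples $(x_i, y_i)$ $\forall i \in \{1, \ldots, |\mathcal{S}_{\neq 0}|\}$, and with the additional constraint that $w_j = 0$ $\forall j \in U_0$.

Output: The function $f$ defined as $f(S) = \left(\frac{1}{(R+\epsilon)z} w^\top \chi(S)\right)^{1/p}$.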
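The example-construction and output steps of Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the helper names (`indicator`, `build_examples`, `is_consistent`, `hypothesis`) are hypothetical, items are assumed to be 0-indexed, and the separator-finding step is left to any linear-programming solver, so the sketch only checks whether a given candidate `(w, z)` is consistent.

```python
import random

def indicator(subset, n):
    """chi(S): 0/1 vector of length n for a set of 0-indexed items."""
    return [1.0 if i in subset else 0.0 for i in range(n)]

def build_examples(sample, n, R, eps, p, rng=None):
    """Build the labeled points (x_i, y_i) and the set U_0 from (set, value) pairs."""
    rng = rng or random.Random(0)          # fixed seed only for reproducibility
    zero_items = set()                     # U_0: items seen in zero-valued sets
    points, labels = [], []
    for subset, value in sample:
        if value == 0:
            zero_items |= set(subset)
            continue
        y = rng.choice([+1, -1])           # fair coin
        scale = 1.0 if y == +1 else R + eps
        points.append(indicator(subset, n) + [scale * value ** p])
        labels.append(y)
    return points, labels, zero_items

def is_consistent(w, z, points, labels, zero_items):
    """Check that u = (w, -z) separates the points and that w_j = 0 on U_0."""
    if z <= 0 or any(w[j] != 0 for j in zero_items):
        return False
    for x, y in zip(points, labels):
        score = sum(wi * xi for wi, xi in zip(w, x[:-1])) - z * x[-1]
        if score * y <= 0:
            return False
    return True

def hypothesis(w, z, R, eps, p):
    """Output hypothesis f(S) = (w . chi(S) / ((R + eps) z))^(1/p)."""
    def f(subset):
        return (sum(w[i] for i in subset) / ((R + eps) * z)) ** (1.0 / p)
    return f
```

For an additive target with item values `v`, parameters `R = 1`, `p = 1`, the vector `w = (1 + eps/2) * v` with `z = 1` is one consistent separator, and the resulting hypothesis stays within a factor `1 + eps` of the target.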



3.2 Better learnability results for XOS valuations with polynomial complexity 

In this section we consider the learnability of XOS valuations representable with a polynomial number of trees. Since 
this class has small complexity, it is easy to see that it is learnable in principle from a small sample size if we did 
not care about computational complexity. Interestingly we can show that we can achieve good PMAC learnability via 
polynomial time algorithms. In particular, we show that XOS functions representable with at most $R$ SUM trees can be PMAC-learned with an $R^{\eta}$ approximation factor in time $n^{O(1/\eta)}$, for any $\eta > 0$. This improves the approximation factor of Theorem 1 for all such XOS functions. Moreover, this implies that XOS valuations representable with a polynomial number of trees can be PMAC-learned within a factor of $n^{\eta}$, in time $n^{O(1/\eta)}$, for any $\eta > 0$.

Theorem 2. For any $\eta > 0$, the class of XOS functions representable with at most $R$ SUM trees is PMAC-learnable in time $n^{O(1/\eta)}$ with approximation factor of $(R+\epsilon)^{\eta}$ by using $O(\ldots)$ training examples.


Proof. Let $L = 1/\eta$ and assume for simplicity that it is an integer. We start by deriving a key structural result. We show that XOS functions can be approximated well by the $L$-th root of a degree-$L$ polynomial over $(\chi(S))_i$ for $i \in [n]$. Let $T_1, \ldots, T_R$ be the $R$ SUM trees in an XOS representation $T$ of $f^*$. For a tree $j$ and a leaf in $T_j$ corresponding to an element $i \in [n]$, let $w_{ji}$ be the weight of the leaf. For any set $S$, let $k_j(S) = \sum_{i \in T_j \cap S} w_{ji} = w_j^\top \chi(S)$ be the sum of weights in tree $T_j$ corresponding to leaves in $S$; $k_j(S)$ is the value assigned to set $S$ by tree $T_j$. Note that $f^*(S) = \max_j k_j(S)$, i.e. the maximum value of any tree, from the definition of MAX. We define a valuation $f'$ that averages the $L$-th powers of the values of all trees: $f'(S) = \frac{1}{R}\sum_j k_j^L(S)$, $\forall S \subseteq [n]$. We claim that $f'(\cdot)$ approximates $(f^*(\cdot))^L$ to within an $R$ factor on all sets $S$, namely

$$f'(S) \le (f^*(S))^L \le R f'(S),\ \forall S \subseteq [n], \quad \text{i.e.} \quad \frac{1}{R}\sum_j k_j^L(S) \le \max_j k_j^L(S) \le \sum_j k_j^L(S),\ \forall S \subseteq [n]. \tag{1}$$

The left-hand side inequalities in Eq. (1) follow as $f^*$ has at most $R$ trees and $k_{j'}^L(S) \le \max_j k_j^L(S)$ for any tree $T_{j'}$. The right-hand side inequalities in Eq. (1) follow immediately.
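As a sanity check of the structural inequality in Eq. (1), the following Python sketch evaluates both sides by brute force on a small XOS function. The representation of each SUM tree as a dictionary from item to non-negative leaf weight, and the function names, are assumptions made only for this illustration.

```python
from itertools import combinations

def tree_values(trees, subset):
    """k_j(S): total weight of the leaves of each SUM tree that lie in S."""
    return [sum(w for item, w in tree.items() if item in subset)
            for tree in trees]

def check_power_mean_bound(trees, n, L, tol=1e-9):
    """Check f'(S) <= f*(S)^L <= R f'(S) for every S in the power set of range(n)."""
    R = len(trees)
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            ks = tree_values(trees, set(subset))
            f_star_pow = max(ks) ** L                 # (f*(S))^L
            f_prime = sum(k ** L for k in ks) / R     # average of L-th powers
            if f_prime > f_star_pow + tol or f_star_pow > R * f_prime + tol:
                return False
    return True
```

The check is exponential in `n` and is only meant for small examples; it does not replace the argument given in the proof.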

This structural result suggests re-representing each set $S$ by a new set of $O(n^L)$ features, with one feature for each subset of $[n]$ with at most $L$ items. Formally, for any set $S \subseteq [n]$, we denote by $\chi_M(S)$ its feature representation over this new set of features: $\chi_M(S)_{i_1, i_2, \ldots, i_L} = 1$ if all items $i_1, i_2, \ldots, i_L$ appear in $S$ and $\chi_M(S)_{i_1, i_2, \ldots, i_L} = 0$ otherwise. It is easy to see that $f'$ is representable as a linear function over this new set of features. This holds for each $k_j^L(S) = (w_j^\top \chi(S))^L$ due to its multinomial expansion, which contains one term for each set of up to $L$ items appearing in tree $T_j$, i.e. for each such feature. Furthermore, $f'$ remains linear when the terms for each tree $T_j$ are added.

6 We are grateful to Shahar Dobzinski and Kshipra Bhawalkar for pointing out this fact to us.

Given this, we can now use a variant of Algorithm 1 with parameters $R$, $\epsilon$, and $p = L$, and to prove correctness we can use a reasoning similar to the one in Theorem 1. Any sample $S_i$ is fed into Algorithm 1 as $(\chi_M(S_i), (f^*(S_i))^L)$ or $(\chi_M(S_i), (R+\epsilon)\cdot(f^*(S_i))^L)$ respectively. Since $f'$ is linear over the set of features, Algorithm 1 outputs with probability at least $1-\delta$ a hypothesis $f''$ that approximates $f^*$ to an $(R+\epsilon)^{1/L}$ factor on any point $\chi_M(S)$ corresponding to sets $S \subseteq [n]$ from a collection $\mathcal{S}$ with at least a $1-\epsilon$ measure in $D$, i.e. $f''(\chi_M(S)) \le f^*(S) \le (R+\epsilon)^{1/L} f''(\chi_M(S))$. We can then output the hypothesis $f(S) = f''(\chi_M(S))$, $\forall S \subseteq [n]$, defined on the initial ground set $[n]$ of items, that approximates $f^*(\cdot)$ well, i.e. for any $S \in \mathcal{S}$ we have

$$f(S) = f''(\chi_M(S)) \le f^*(S) \le (R+\epsilon)^{1/L} f''(\chi_M(S)) = (R+\epsilon)^{1/L} f(S).$$

As desired, with high confidence the hypothesis $f$ approximates $f^*$ to an $(R+\epsilon)^{\eta}$ factor on most sets from $D$. □
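The feature map used in this proof, and the claim that the $L$-th power of a SUM tree is linear over it, can be illustrated with a short Python sketch. The names `monomial_features` and `power_coefficients` are hypothetical, features are indexed by sorted tuples of 0-indexed items, and the coefficients are obtained by naively expanding the power over all $L$-tuples, which is only practical for tiny instances.

```python
from itertools import combinations, product

def monomial_features(subset, n, L):
    """chi_M(S): one 0/1 feature per non-empty item set T with |T| <= L,
    equal to 1 exactly when every item of T appears in S."""
    return {T: 1 if set(T) <= subset else 0
            for size in range(1, L + 1)
            for T in combinations(range(n), size)}

def power_coefficients(w, L):
    """Coefficients c_T such that (sum_{i in S} w[i])**L equals
    sum_T c_T * chi_M(S)[T], obtained by expanding the L-th power."""
    coeffs = {}
    for tup in product(range(len(w)), repeat=L):
        T = tuple(sorted(set(tup)))        # distinct items used by this term
        term = 1.0
        for i in tup:
            term *= w[i]
        coeffs[T] = coeffs.get(T, 0.0) + term
    return coeffs
```

Averaging such coefficient vectors over the trees gives a linear representation of $f'$ in the expanded feature space, which is what allows the reduction to a linear separator problem.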

This result has an appealing interpretation in terms of representations of submodular functions. We know that any 
submodular function is representable as an XOS tree. What Theorem 2 implies is that (submodular) functions that are
succinctly representable as XOS trees can be PMAC-learned well. Theorem 2 is thus a target-dependent learnability
result, in that the extent of learnability of a function depends on the function's complexity. 

3.3 Better learnability results for XOS valuations with small SUM trees 

In this section we consider the learnability of another interesting subclass of XOS valuations, namely XOS valuations
representable with "small" SUM trees, and show learnability to a better factor than that in Theorem 1. For example,
consider a traveler deciding between many trips, each to a different location with a small number of tourist attractions. 
The traveler has an additive value for several attractions at the same location. This valuation can be represented as 
an XOS function where each SUM tree stands for a location and has a small number of leaves. We now show good 
PMAC-learning guarantees for classes of functions of this type. 

Theorem 3. For any $\eta > 0$, the class of XOS functions representable with SUM trees with at most $R$ leaves is properly PMAC-learnable with approximation factor of $R(1+\eta)$ by using $m = O\left(\frac{1}{\epsilon}\left(n \log\log_{1+\eta}\left(\frac{H}{h}\right) + \log(1/\delta)\right)\right)$ training examples and running time polynomial in $m$, where $h$ and $H$ are the smallest and the largest non-zero values our functions can take.

Proof. We show that the unit-demand hypothesis $f$ output by Algorithm 2 produces the desired result. The algorithm constructs a unit-demand hypothesis function $f$ as follows. For any $i$ that appears in at least one set $S_j$ in the sample we define $f(i)$ as the smallest value $f^*(S_j)$ over all the sets $S_j$ in the sample containing $i$. For $i$ that does not appear in any set $S_j$ define $f(i) = 0$.

We start by proving a key structural result showing that $f$ approximates the target function multiplicatively within a factor of $R$ over the sample. That means:

$$f(S_l) \le f^*(S_l) \le R f(S_l) \quad \text{for all } l \in \{1, 2, \ldots, m\}. \tag{2}$$

To see this note that for any $i \in S_l$ we have $f^*(i) \le f^*(S_l)$, for $l \in \{1, 2, \ldots, m\}$. So

$$f(i) \ge f^*(i) \quad \text{for any } i \in S_1 \cup \ldots \cup S_m. \tag{3}$$

Therefore for any $l \in \{1, 2, \ldots, m\}$:

$$f^*(S_l) \le R \max_{i \in S_l} f^*(i) \le R \max_{i \in S_l} f(i) = R f(S_l),$$

where the first inequality follows by definition, and the second inequality follows from relation (3). By definition, for any $i \in S_l$, $f(i) \le f^*(S_l)$. Thus, $f(S_l) = \max_{i \in S_l} f(i) \le f^*(S_l)$. These together imply relation (2), as desired.



Algorithm 2 Algorithm for PMAC-learning interesting classes of XOS and OXS valuations.
Input: A sequence of training examples $\mathcal{S} = \{(S_1, f^*(S_1)), (S_2, f^*(S_2)), \ldots, (S_m, f^*(S_m))\}$.

• Set $f(i) = \min_{j : i \in S_j} f^*(S_j)$ if $i \in \bigcup_{l=1}^m S_l$ and $f(i) = 0$ if $i \notin \bigcup_{l=1}^m S_l$.

Output: The unit-demand valuation $f$ defined by $f(S) = \max_{i \in S} f(i)$ for any $S \subseteq \{1, \ldots, n\}$.
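Algorithm 2 translates almost directly into code. The Python sketch below is one possible rendering; the function name `learn_unit_demand` and the representation of the sample as a list of `(set, value)` pairs are assumptions made for illustration.

```python
def learn_unit_demand(sample):
    """Algorithm 2: assign each item the smallest value of any training set
    containing it, and return the induced unit-demand hypothesis."""
    item_value = {}
    for subset, value in sample:
        for i in subset:
            item_value[i] = min(value, item_value.get(i, value))

    def f(subset):
        # Unit-demand value: best single item; unseen items are worth 0.
        return max((item_value.get(i, 0.0) for i in subset), default=0.0)

    return f, item_value
```

On the training sets themselves, the returned hypothesis satisfies the sandwich in relation (2) whenever the target is an XOS function whose SUM trees have at most $R$ leaves.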



To finish the proof we show that $m = O\left(\frac{1}{\epsilon}\left(n \log\log_{1+\eta}\left(\frac{H}{h}\right) + \log(1/\delta)\right)\right)$ is sufficient so that with probability at least $1-\delta$, $f$ approximates the target function $f^*$ multiplicatively within a factor of $R(1+\eta)$ on a $1-\epsilon$ fraction of the distribution. Let $F_\eta$ be the class of unit-demand functions that assign to each individual leaf a power of $(1+\eta)$ in $[h, H]$. Clearly $|F_\eta| = \left(\log_{1+\eta}\left(\frac{H}{h}\right)\right)^n$. It is easy to see that $m = O\left(\frac{1}{\epsilon}\left(n \log\log_{1+\eta}\left(\frac{H}{h}\right) + \log(1/\delta)\right)\right)$ examples are sufficient such that any function in $F_\eta$ that approximates the target function on the sample multiplicatively within a factor of $R(1+\eta)$ will, with probability at least $1-\delta$, approximate the target function multiplicatively within a factor of $R(1+\eta)$ on a $1-\epsilon$ fraction of the distribution. Since $F_\eta$ is a multiplicative $L_\infty$ cover for the class of unit-demand functions, we easily get the desired result [1]. □

4 PMAC-learnability of OXS and Gross Substitutes Valuations 

In this section we study the learnability of subclasses of submodular valuations, namely OXS and gross substitutes. We 
start by focusing on interesting subclasses of OXS functions that arise in practice, namely OXS functions representable 
with a small number of MAX trees or leaves.$^7$ For example, a traveler presented with a collection of plane tickets,
hotel rooms, and rental cars for a given location might value the bundle as the sum of his values on the best ticket, 
the best hotel room, and the best rental car. This valuation is OXS, with one MAX tree for each travel requirement. 
The number of MAX trees, i.e. travel requirements, is small but the number of leaves in each tree may be large. As
another example, consider a company producing airplanes that must procure many different components
for assembling an airplane. The number of suppliers for each component is small, but the number of components may 
be very large (more than a million in today's airplanes). The company's value for a set of components of the same 
type, each from a different supplier, is its highest value for any such component. The company's value for a set of 
components of different types is the sum of the values for each type. This valuation is representable as an OXS, with 
one tree for each component type. The number of leaves, i.e. suppliers, in each MAX tree is small but there may be 
many such trees. In this section, we show good PMAC-learning guarantees for classes of functions of these types. 
Formally: 

Theorem 4. (1) Let $\mathcal{F}$ be the family of OXS functions representable with at most $R$ MAX trees. For any $\eta > 0$, the family $\mathcal{F}$ is properly PMAC-learnable with approximation factor of $R(1+\eta)$ by using $m = O\left(\frac{1}{\epsilon}\left(n \log\log_{1+\eta}\left(\frac{H}{h}\right) + \log(1/\delta)\right)\right)$ training examples and running time polynomial in $m$, where $h$ and $H$ are the smallest and the largest value our functions can take. For constant $R$, the class $\mathcal{F}$ is PAC-learnable by using $O(n^R \log(n/\delta)/\epsilon)$ training examples and running time $\mathrm{poly}(n, 1/\epsilon, 1/\delta)$.

(2) For any $\epsilon > 0$, the class of OXS functions representable with MAX trees with at most $R$ leaves is PMAC-learnable with approximation factor $R+\epsilon$ by using $O\left(\frac{n}{\epsilon}\log\left(\frac{n}{\delta\epsilon}\right)\right)$ training examples and running time $\mathrm{poly}(n, 1/\epsilon, 1/\delta)$.

Proof sketch. (1) We can show that a function $f^*$ with an OXS representation $T$ with at most $R$ trees can also be represented as an XOS function with at most $R$ leaves per tree. Indeed, for each tuple of leaves, one from each tree in $T$, we create a SUM tree with these leaves. The XOS representation of $f^*$ is the MAX of all these trees. Given this, the fact that $\mathcal{F}$ is learnable to a factor of $R(1+\eta)$ for any $\eta$ follows from Theorem 3.

We now show that when $R$ is constant the class $\mathcal{F}$ is PAC-learnable. First, using a similar argument to the one in Theorem 3 we can show that Algorithm 2 can be used to PAC-learn any unit-demand valuation by using $m = O(n \ln(n/\delta)/\epsilon)$ training examples and time $\mathrm{poly}(n, 1/\epsilon, 1/\delta)$ (see Lemma 3 in Appendix B). Second, it is easy to see that an OXS function $f^*$ representable with at most $R$ trees can also be represented as a unit-demand valuation with at most $n^R$ leaves, with $R$-tuples as items (see Lemma 4 in Appendix B). These two facts together imply that for constant $R$, the class $\mathcal{F}$ is PAC-learnable by using $O(n^R \log(n/\delta)/\epsilon)$ training examples and running time $\mathrm{poly}(n, 1/\epsilon, 1/\delta)$.

7 We note that the literature on algorithms for secretary problems [3, 4] often considers a subclass of the latter class, in which each item must have the same value in any tree.

(2) We start by showing the following structural result: if $f^*$ has an OXS representation with at most $R$ leaves in any MAX tree, then it can be approximated by a linear function within a factor of $R$ on every subset of the ground set. In particular, the linear function $f$ defined as $f(S) = \sum_{i \in S} f^*(i)$, for all $S \subseteq \{1, \ldots, n\}$, satisfies

$$f^*(S) \le f(S) \le R \cdot f^*(S) \quad \text{for all } S \subseteq \{1, \ldots, n\}. \tag{4}$$

By subadditivity, $f^*(S) \le f(S)$ for all $S$. Let $f^*_1, \ldots, f^*_k$ be the unit-demand functions that define $f^*$. Fix a set $S \subseteq [n]$. For any item $i \in S$, define $j_i$ to be the index of the $f^*_j$ under which item $i$ has highest value: $f^*(i) = f^*_{j_i}(\{i\})$. Then for the partition $(S_1, \ldots, S_k)$ of $S$ in which item $i$ is mapped to $S_{j_i}$ for any $i$, we have $\sum_{i \in S} f^*(i) \le R f^*_1(S_1) + \ldots + R f^*_k(S_k)$, since each $f^*_j$ has at most $R$ leaves. Therefore:

$$f(S) = \sum_{i \in S} f^*(i) \le R \max_{(S_1, \ldots, S_k) \text{ partition of } S} \left(f^*_1(S_1) + \ldots + f^*_k(S_k)\right) = R f^*(S),$$

where the last equality follows simply from the definition of an OXS function.
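The linear sandwich in relation (4) can be checked by brute force on a small OXS function. In the Python sketch below each MAX tree is a dictionary from item to non-negative weight, and the OXS value is computed by exhaustively assigning at most one distinct item of $S$ to each tree; this representation and the function names are assumptions for illustration only, and the search is exponential.

```python
from itertools import combinations

def oxs_value(trees, subset):
    """OXS value of S: each MAX tree takes at most one item of S,
    and no item is used by two trees."""
    def best(j, remaining):
        if j == len(trees):
            return 0.0
        value = best(j + 1, remaining)                 # tree j takes nothing
        for i in remaining:
            if i in trees[j]:
                value = max(value,
                            trees[j][i] + best(j + 1, remaining - {i}))
        return value
    return best(0, frozenset(subset))

def check_linear_sandwich(trees, n, tol=1e-9):
    """Check f*(S) <= sum_{i in S} f*({i}) <= R f*(S) for all S,
    where R is the largest number of leaves in any MAX tree."""
    R = max(len(tree) for tree in trees)
    singles = [oxs_value(trees, {i}) for i in range(n)]
    for size in range(n + 1):
        for S in combinations(range(n), size):
            f_star = oxs_value(trees, S)
            f_lin = sum(singles[i] for i in S)
            if f_star > f_lin + tol or f_lin > R * f_star + tol:
                return False
    return True
```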

Given the structural result (4), we can PMAC-learn the class of OXS functions representable with MAX trees with at most $R$ leaves by using Algorithm 1 with parameters $R$, $\epsilon$ and $p = 1$. Correctness follows by a reasoning similar to the one in Theorem 1. □

We now consider the class of Gross Substitutes valuations, a superclass of OXS valuations and a subclass of submodular valuations (recall Lemma 1). Gross Substitutes are fundamental to allocation problems with per-item prices [10, 18, 28]; in particular a set of per-item market-clearing prices exists if and (almost) only if all customers have gross substitutes valuations. A valuation is gross substitutes if raising prices on some items preserves the demand on other items. Given prices on items, an agent with valuation $f$ demands a preferred set, formalized as follows.

Definition 5. For price vector $p \in \mathbb{R}^n$, the demand correspondence $\mathcal{D}_f(p)$ of valuation $f$ is the collection of preferred sets at prices $p$, i.e. $\mathcal{D}_f(p) = \arg\max_{S \subseteq \{1, \ldots, n\}} \{f(S) - \sum_{j \in S} p_j\}$. A valuation $f$ is gross substitutes (GS) if for any price vectors $p' \ge p$ (i.e. $p'_i \ge p_i$ $\forall i \in [n]$), and any $A \in \mathcal{D}_f(p)$ there exists $A' \in \mathcal{D}_f(p')$ with $A' \supseteq \{i \in A : p_i = p'_i\}$.

That is, the GS property requires that all items $i$ in some preferred set $A$ at the old prices $p$ and for which the old and new prices are equal ($p_i = p'_i$) are simultaneously contained in some preferred set $A'$ at the new prices $p'$.

As mentioned earlier, Balcan and Harvey [5] proved that it is hard to PMAC-learn the class of submodular functions with an approximation factor $o(n^{1/3}/\log n)$. We show here that their result applies even for the class of gross substitutes. This is quite surprising since such functions are typically considered easy from an economic optimization point of view. Specifically:

Theorem 5. No algorithm can PMAC-learn the class of gross substitutes with an approximation factor of $o(n^{1/3}/\log n)$. This holds even if $D$ is known and value queries are allowed.

Proof. It is known that the class of matroid rank functions cannot be PMAC-learned with an approximation factor of $o(n^{1/3}/\log n)$, even if $D$ is known and value queries are allowed [5]. One can show that a matroid rank function is a gross substitutes function (see Lemma 2 below). Combining these yields the theorem. □

Our key tool for proving that any matroid rank function is also GS (Lemma 2 below) is a valuation-based characterization of gross substitutes valuations due to [25].

Lemma 2. A matroid rank function is gross substitutes. 

Proof. Denote $f$'s marginal value over $S$ by $f_S(A) = f(S \cup A) - f(S)$, $\forall A \subseteq [n] \setminus S$. As shown in [25], $f$ is GS if and only if

$$f_S(ab) + f_S(c) \le \max\{f_S(ac) + f_S(b),\ f_S(bc) + f_S(a)\} \quad \text{for all items } a, b, c \text{ and set } S, \tag{5}$$

i.e. (by taking permutations) there is no unique maximizer among $f_S(ab) + f_S(c)$, $f_S(ac) + f_S(b)$, $f_S(bc) + f_S(a)$.

If $f$ is a matroid rank function, then so is $f_S$; in particular, $f_S(A) \le |A|$, $\forall A \subseteq [n] \setminus S$. We reason by case analysis.

Suppose that $f_S(ab) = 2$. Then we have $f_S(a) = f_S(b) = 1$. If $f_S(ac) = 2$ or $f_S(bc) = 2$, then $f_S(c) = 1$ and hence the inequality (5) holds. On the other hand, if $f_S(ac) = f_S(bc) = 1$, then we have by the monotonicity and submodularity of $f_S$, $f_S(ab) + f_S(c) \le f_S(abc) + f_S(c) \le f_S(ac) + f_S(bc) = 2$, and the inequality (5) holds.

If $f_S(ab) \le 1$ then $f_S(ab) = \max\{f_S(a), f_S(b)\}$. As $f_S(c) \le f_S(ac)$ and $f_S(c) \le f_S(bc)$, Eq. (5) follows. □
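Condition (5) is easy to test exhaustively on small ground sets. The Python sketch below does so for two simple matroid rank functions (a uniform matroid and a partition matroid) chosen as assumptions for illustration; the function names are hypothetical and the check runs over all sets $S$ and all ordered triples of distinct items outside $S$.

```python
from itertools import combinations, permutations

def marginal(f, S, A):
    """f_S(A) = f(S u A) - f(S)."""
    return f(S | A) - f(S)

def satisfies_triple_condition(f, n):
    """Check inequality (5) for every set S and distinct items a, b, c not in S."""
    items = range(n)
    for size in range(n + 1):
        for S in map(frozenset, combinations(items, size)):
            rest = [i for i in items if i not in S]
            for a, b, c in permutations(rest, 3):
                lhs = marginal(f, S, {a, b}) + marginal(f, S, {c})
                rhs = max(marginal(f, S, {a, c}) + marginal(f, S, {b}),
                          marginal(f, S, {b, c}) + marginal(f, S, {a}))
                if lhs > rhs:
                    return False
    return True

def uniform_rank(k):
    """Rank function of the uniform matroid of rank k."""
    return lambda S: min(len(S), k)

def partition_rank(blocks, caps):
    """Rank function of a partition matroid with the given blocks and capacities."""
    return lambda S: sum(min(len(S & block), cap)
                         for block, cap in zip(blocks, caps))
```

A valuation with complementarities, such as one that is worth 1 only when two specific items are both present, fails the same check, which matches the intuition that such valuations are not gross substitutes.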

We note that Lemma 2 was previously proven in [26] in a more involved way via the concept of $M^\natural$-concavity from discrete convex analysis.

5 Learnability everywhere with value queries 

In this section, we consider approximate learning with value queries [17, 31]. This is relevant for settings where instead of passively observing the values of $f^*$ on sets $S$ drawn from a distribution, the learner is able to actively query the value $f^*(S)$ on sets $S$ of its choice and the goal is to approximate with certainty the target $f^*$ on all $2^n$ sets after querying the values of $f^*$ on polynomially many sets. Formally:

Definition 6. We say that an algorithm $\mathcal{A}$ learns the valuation family $\mathcal{F}$ everywhere with value queries with an approximation factor of $\alpha \ge 1$ if, for any target function $f^* \in \mathcal{F}$, after querying the values of $f^*$ on polynomially (in $n$) many sets, $\mathcal{A}$ outputs in time polynomial in $n$ a function $f$ such that $f(S) \le f^*(S) \le \alpha f(S)$, $\forall S \subseteq \{1, \ldots, n\}$.

Goemans et al. [17] show that for submodular functions the learnability factor with value queries is $\tilde{\Theta}(n^{1/2})$. We show here that their lower bound applies to the more restricted OXS and GS classes (their upper bound automatically applies). We also show that this lower bound can be circumvented for the interesting subclasses of OXS and XOS that we considered earlier, efficiently achieving a factor of $R$.

Theorem 6. (1) The classes of OXS and GS functions are learnable with value queries with an approximation factor of $\tilde{\Theta}(n^{1/2})$.

(2) The following classes are learnable with value queries with an approximation factor of R: OXS with at most 
R leaves in each tree, OXS with at most R trees, XOS with at most R leaves in each tree, and XOS with at most R 
trees. 

Proof Sketch. (1) We show in Appendix C that the family of valuation functions used in [17] for proving the $\Omega\left(\frac{n^{1/2}}{\log n}\right)$ lower bound for learning submodular valuations with value queries is contained in OXS. The valuations in this family are of the form $g(S) = \min(|S|, \alpha')$ and $g_R(S) = \min(\beta + |S \cap (\{1, \ldots, n\} \setminus R)|, |S|, \alpha')$ for $\alpha' = x n^{1/2}/5$, $\beta = x^2/5$ with $x^2 = \omega(\log n)$ and $R$ a subset of $\{1, \ldots, n\}$ of size $\alpha'$ (chosen uniformly at random). These valuations are OXS; for example, $g(S)$ can be expressed as a SUM of $\alpha'$ MAX trees, each having as leaves all items in $[n]$ with weight 1.

(2) To establish learnability for these interesting subclasses, we recall that for the first three of them (Theorems 3 and 4) any valuation $f^*$ in each class was approximated to an $R$ factor by a function $f$ that only depended on the values of $f^*$ on items. An analogous result holds for the fourth class, i.e. XOS with at most $R$ trees; indeed, for such an XOS $f^*$, we have $\frac{1}{R}\sum_{i \in S} f^*(\{i\}) \le f^*(S) \le R \cdot \frac{1}{R}\sum_{i \in S} f^*(\{i\})$, $\forall S \subseteq [n]$. One can then query these $n$ values and output the corresponding valuation $f$. □
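For the fourth class (XOS with at most $R$ trees), the value-query learner described in part (2) amounts to $n$ singleton queries followed by a scaled sum. The Python sketch below shows this under the assumption that the target is accessible through a hypothetical `query` callable taking a set of 0-indexed items.

```python
def learn_from_item_queries(query, n, R):
    """Query the n singleton values f*({i}) and return the hypothesis
    f(S) = (1/R) * sum_{i in S} f*({i})."""
    item_values = [query(frozenset({i})) for i in range(n)]

    def f(subset):
        return sum(item_values[i] for i in subset) / R

    return f
```

The returned hypothesis uses exactly $n$ value queries and, by the inequality displayed above, satisfies $f(S) \le f^*(S) \le R f(S)$ on every set when the target has an XOS representation with at most $R$ SUM trees.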

Note: We note that the lower bound technique in [17] has been later used in a sequence of papers [19, 16, 31] concerning optimization under submodular cost functions and our result (Lemma 6 in particular) implies that all the lower bounds in these papers apply to the smaller classes of OXS functions and GS functions.

Note: We also note that since XOS contains all submodular valuations, the lower bound of Goemans et al. [17] implies that the XOS class is not learnable everywhere with value queries to an $o\left(\frac{n^{1/2}}{\log n}\right) = \tilde{o}(n^{1/2})$ factor. For the same $\Omega\left(\frac{n^{1/2}}{\log n}\right)$ lower bound, our proof technique (and associated family of XOS valuations) for Theorem 1 offers a simpler argument than that in [17].






6 Learning with prices 



We now introduce a new paradigm that is natural in many applications where the learner can repeatedly obtain in- 
formation on the unknown valuation function of an agent via the agent's decisions to purchase or not rather than via 
random samples from this valuation or via queries to it. In this framework, the learner does not obtain the value of $f^*$ on each input set $S_1, S_2, \ldots$. Instead, for each input set $S_l$, the learner observes $S_l$, quotes a price $p_l$ (of its choosing) on $S_l$ and obtains one bit of information: whether the agent purchases $S_l$ or not, i.e. whether $p_l \le f^*(S_l)$ or not. The goal remains to approximate the function $f^*$ well, i.e. within an $\alpha$ multiplicative factor: on most sets from $D$ with high confidence for PMAC-learning and on all sets with certainty for learning everywhere with value queries. The learner's challenge is in choosing prices that allow discovery of the agent's valuation. This framework is a special case of demand queries [28], where prices are: $p_l$ on $S_l$ and $\infty$ elsewhere. We call PMAC-learning with prices and VQ-learning with
prices the variants of this framework applied to our two learning models. Each variant in this framework offers less 
information to the learner than its respective basic model. 

Clearly, all our PMAC-learning lower bounds still hold for PMAC-learning with prices. More interestingly, our 
upper bounds still hold as well. In particular, we provide a reduction from the problem of PMAC-learning with prices 
to the problem of learning a linear separator, for functions $f^*$ such that for some $p > 0$, $(f^*)^p$ can be approximated to a $\beta$ factor by a linear function. Such $f^*$ can be PMAC-learned to a $\beta^{1/p}$ factor by Algorithm 1. What we show in Theorem 7 below is that such $f^*$ are PMAC-learnable with prices to a factor of $(1 + o(1))\beta^{1/p}$ using only a small increase in the number of samples over that used for (standard) PMAC learning. For convenience, we assume in this section that all valuations are integral and that $H$ is an upper bound on the values of $f^*$, i.e. $f^*(S) \le H$, $\forall S \subseteq [n]$.

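Theorem 7. Consider a family $\mathcal{F}$ of valuations such that the $p$-th power of any $f^* \in \mathcal{F}$ can be approximated to a $\beta$ factor by a linear function: i.e., for some $w$ we have $w^\top \chi(S) \le (f^*(S))^p \le \beta w^\top \chi(S)$ for all $S \subseteq [n]$, where $\beta \ge 1$, $p > 0$. Then for any $0 < \eta \le 1$, the family $\mathcal{F}$ is PMAC-learnable with prices to a $(1+\eta)\beta^{1/p}$ factor using $O\left(\frac{n \log H}{\eta\epsilon} \ln\left(\frac{n \log H}{\eta\epsilon\delta}\right)\right)$ samples and time $\mathrm{poly}(n, \frac{1}{\epsilon}, \frac{1}{\delta}, \frac{1}{\eta})$.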

Proof. As in Algorithm 1, the idea is to use a reduction to learning a linear separator, but where now the examples use prices (that the algorithm can choose) instead of function values (that the algorithm can no longer observe). For each input set $S_l$, the purchase decision amounts to a comparison between the chosen price $q_l$ and $f^*(S_l)$. Using the result of this comparison we will construct examples, based on the prices $q_l$, that are always consistent with a linear separator obtained from $w^\top \chi(S_l)$, the linear function that approximates $(f^*)^p$. We will sample enough sets $S_l$ and assign prices $q_l$ to them in such a way that for sufficiently many $l$, the price $q_l$ is close to $f^*(S_l)$. We then find a linear separator that has small error on the distribution induced by the price-based examples and show that this yields a hypothesis $f(S)$ whose error is not much higher on the original distribution with respect to the values of the (unknown) target function.

Specifically, we take $m = O\left(\frac{n \log H}{\eta\epsilon} \ln\left(\frac{n \log H}{\eta\epsilon\delta}\right)\right)$ samples, and for convenience define $N = \lfloor \log_{1+\eta/3} H \rfloor$. We assign to each input bundle $S_l$ a price $q_l$ drawn uniformly at random from $\{(1+\eta/3)^i\}$ for $i = 0, 1, 2, \ldots, N+1$, and present bundle $S_l$ to the agent at price $q_l$. The key point is that for bundles $S_l$ such that $f^*(S_l) \ge 1$ this ensures at least a $\frac{1}{N+2}$ probability that $f^*(S_l)(1+\eta/3)^{-1} < q_l \le f^*(S_l)$ and at least a $\frac{1}{N+2}$ probability that $f^*(S_l) < q_l \le f^*(S_l)(1+\eta/3)$ (the case of $f^*(S_l) = 0$ will be noticed when the agent does not purchase at price $q_l = 1$ and is handled as in the proof of Theorem 1).

We construct new examples based on these prices and purchase decisions as follows. If $f^*(S_l) < q_l$ (i.e. the agent does not buy) then we let $(x_l, y_l) = ((\chi(S_l), \beta q_l^p), -1)$. If $f^*(S_l) \ge q_l$ (i.e. the agent buys) then we let $(x_l, y_l) = ((\chi(S_l), q_l^p), +1)$. Note that by our given assumption, the examples constructed are always linearly separable. In particular the label $y_l$ matches $\mathrm{sgn}((\beta w, -1)^\top x_l)$ in each case: $\beta w^\top \chi(S_l) \le \beta (f^*(S_l))^p < \beta q_l^p$ and $q_l^p \le (f^*(S_l))^p \le \beta w^\top \chi(S_l)$ respectively. Let $D_{\mathrm{buy}}$ denote the induced distribution on $\mathbb{R}^{n+1}$. We now find a linear separator $(w, -z) \in \mathbb{R}^{n+1}$, where $w \in \mathbb{R}^n$ and $z \in \mathbb{R}_+$, that is consistent with $(x_l, y_l)$, $\forall l$. We construct an intermediary hypothesis $f'(S) = \frac{1}{\beta z} w^\top \chi(S)$ based on the learned linear separator. The hypothesis output will be $f(S) = (f'(S))^{1/p}$.
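The price grid and the construction of labeled examples from purchase decisions can be sketched as follows. This Python fragment is only an illustration under stated assumptions: `price_grid` and `priced_example` are hypothetical names, the agent's decision is passed in as a boolean, and the separator-learning step is again left to an external solver.

```python
import math

def price_grid(H, eta):
    """Prices (1 + eta/3)**i for i = 0, ..., N + 1,
    with N = floor(log_{1 + eta/3} H)."""
    base = 1.0 + eta / 3.0
    N = math.floor(math.log(H) / math.log(base))
    return [base ** i for i in range(N + 2)]

def priced_example(indicator, buys, q, beta, p):
    """Turn one purchase decision at price q into a labeled point in R^{n+1}."""
    if buys:                                    # agent buys: q <= f*(S)
        return list(indicator) + [q ** p], +1
    return list(indicator) + [beta * q ** p], -1    # agent declines: q > f*(S)
```

In use, a price would be drawn uniformly at random from `price_grid(H, eta)` for each sampled bundle, the bundle offered at that price, and the resulting decision passed to `priced_example`.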

By standard VC-dimension sample-complexity bounds, our sample size $m$ is sufficient that the linear separator $(w, -z)$ has error on $D_{\mathrm{buy}}$ at most $\frac{\epsilon}{N+2}$ with probability at least $1-\delta$. We now show that this implies that with probability at least $1-\delta$, hypothesis $f(S)$ approximates $f^*(S)$ to a factor $(1+\eta/3)^2\beta^{1/p} \le (1+\eta)\beta^{1/p}$ over $D$, on all but at most an $\epsilon$ probability mass, as desired.

Specifically, consider some bundle $S$ for which $f(S)$ does not approximate $f^*(S)$ to a factor $(1+\eta/3)^2\beta^{1/p}$ and for which $f^*(S) \ge 1$ (recall that zeroes are handled separately). We just need to show that for such bundles $S$, there is at least a $\frac{1}{N+2}$ probability (over the draw of price $q$) that $(w, -z)$ makes a mistake on the resulting example from $D_{\mathrm{buy}}$. There are two cases to consider:

1. It could be that $f$ is a bad approximation because $f(S) > f^*(S)$. This implies that $f'(S) > [(1+\eta/3)f^*(S)]^p$ or equivalently that $w^\top \chi(S) > \beta z[(1+\eta/3)f^*(S)]^p$. In this case we use the fact that there is at least a $\frac{1}{N+2}$ chance that $f^*(S) < q \le f^*(S)(1+\eta/3)$. If this occurs, then the agent doesn't buy (yielding $x = (\chi(S), \beta q^p)$, $y = -1$) and yet $w^\top \chi(S) > \beta z q^p$. Thus the separator mistakenly predicts positive.

2. Alternatively it could be that $(1+\eta/3)^2\beta^{1/p} f(S) < f^*(S)$. This implies that $(1+\eta/3)\beta^{1/p} f'(S)^{1/p} < f^*(S)$ or equivalently that $w^\top \chi(S) < z\left(\frac{f^*(S)}{1+\eta/3}\right)^p$. In this case, we use the fact that there is at least a $\frac{1}{N+2}$ chance that $\frac{f^*(S)}{1+\eta/3} < q \le f^*(S)$. If this occurs, then the agent does buy (yielding $x = (\chi(S), q^p)$, $y = +1$) and yet $w^\top \chi(S) < z q^p$. Thus the separator mistakenly predicts negative.

Thus, the error rate under $D_{\mathrm{buy}}$ is at least a $\frac{1}{N+2}$ fraction of the error rate under $D$, and so a low error under $D_{\mathrm{buy}}$ implies a low error under $D$ as desired. □

Note: We note that if there is an underlying desired pricing algorithm A, for each input set Si we can take the price 
of Si to be A(Si) with probability 1 — e and a uniformly at random price in {1, 2, 4 ... , H/2, H} as in the previous 
result with probability e. The sample complexity of learning only goes up by a factor of at most lo s H lo s log _ _ 

We can also recover our upper bounds on learnability everywhere with value queries (the corresponding lower bounds clearly hold). By sequentially setting prices $1, 2, 4, \ldots, H/2, H$ on each item we can learn $f^*$'s values on items within a factor of 2. Our structural results proving the approximability of $f^*$ from interesting classes with a function that only depends on $f^*(\{1\}), \ldots, f^*(\{n\})$ then yield the VQ-learnability with prices of these classes.

Theorem 8. The following classes are VQ-learnable with prices to within a $2R$ factor: OXS with at most $R$ trees, OXS with at most $R$ leaves in each tree, XOS with at most $R$ trees, and XOS with at most $R$ leaves in each tree.
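The doubling-price procedure used to estimate item values can be sketched in Python as follows. The function name `estimate_item_value` and the `buys_at` callable, which stands in for the agent's purchase decision on a single item at a quoted price, are assumptions made for this illustration; valuations are taken to be integral with upper bound $H$, as assumed in this section.

```python
def estimate_item_value(buys_at, H):
    """Offer prices 1, 2, 4, ... (up to H) on one item and return the largest
    accepted price, or 0 if the item is refused at price 1."""
    estimate, price = 0, 1
    while price <= H and buys_at(price):
        estimate = price
        price *= 2
    return estimate
```

For an item of integral value $v \ge 1$ the estimate $e$ satisfies $e \le v < 2e$, so plugging the $n$ estimates into the item-based hypotheses from the structural results loses at most an extra factor of 2.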



7 Conclusions 

In this paper we study the approximate learnability of valuations commonly used throughout economics and game 
theory for the quantitative encoding of agent preferences. We provide upper and lower bounds regarding the learnabil- 
ity of important subclasses of valuation functions that express no-complementarities. Our main results concern their 
approximate learnability in the distributional learning (PAC-style) setting. We provide nearly tight lower and upper bounds of $\tilde{\Theta}(n^{1/2})$ on the approximation factor for learning XOS and subadditive valuations, both widely studied superclasses of submodular valuations. Interestingly, we show that the $\tilde{\Omega}(n^{1/2})$ lower bound can be circumvented for XOS functions of polynomial complexity; we provide an algorithm for learning the class of XOS valuations with a representation of polynomial size achieving an $O(n^{\epsilon})$ approximation factor in time $n^{O(1/\epsilon)}$ for any $\epsilon > 0$. This highlights the importance of considering the complexity of the target function for polynomial time learning. We also provide new learning results for interesting subclasses of submodular functions. Our upper bounds for distributional learning leverage novel structural results for all these valuation classes. We show that many of these results provide new learnability results in the Goemans et al. model [17] of approximate learning everywhere via value queries.

We also introduce a new model that is more realistic in economic settings, in which the learner can set prices and 
observe purchase decisions at these prices rather than observing the valuation function directly. In this model, most of 
our upper bounds continue to hold despite the fact that the learner receives less information (both for learning in the 
distributional setting and with value queries), while our lower bounds naturally extend.

A Additional Details for the Proof of Theorem 1

For PMAC-learning XOS valuations to a $\sqrt{n+\epsilon}$ factor, we apply Algorithm 1 with parameters $R = n$, $\epsilon$, and $p = 2$. The proof of correctness of Algorithm 1 follows by using the structural result in Claim 1 and a technique of [5]. We provide here the full details of this proof.

Because of the multiplicative error allowed by the PMAC-learning model, we separately analyze the subset of the instance space where $f^*$ is zero and the subset of the instance space where $f^*$ is non-zero. For convenience, we define:

$$\mathcal{P} = \{S : f^*(S) \neq 0\} \quad \text{and} \quad \mathcal{Z} = \{S : f^*(S) = 0\}.$$

The main idea of our algorithm is to reduce our learning problem to the standard problem of learning a binary classifier 
(in fact, a linear separator) from i.i.d. samples in the passive, supervised learning setting [21 33 ] with a slight twist in 
order to handle the points in Z. The problem of learning a linear separator in the passive supervised learning setting 
is one where the instance space is M. m , the samples come from some fixed and unknown distribution D' on R m , and 
there is a fixed but unknown target function c* : M m — > { — 1, +1}, c*(x) = sgn(u J x). The examples induced by D' 
and c* are called linearly separable since there exists a vector u such that c*(x) = sgn(w T a;). The linear separator 
learning problem we reduce to is defined as follows. The instance space is K m where m = n + 1 and the distribution 
D' is defined by the following procedure for generating a sample from it. Repeatedly draw a sample S C [n] from the 
distribution D until f*(S) ^ 0. Next, flip a fair coin for each. The sample from D' is 

(X(S), (f*(S)) 2 ) (if the coin is heads) 
(x(S),(n + e) ■ (f*(S)) 2 ) (if the coin is tails). 
The function $c^*$ defining the labels is as follows: samples for which the coin was heads are labeled $+1$, and the others
are labeled $-1$. We claim that the distribution over labeled examples induced by $D'$ and $c^*$ is linearly separable in
$\mathbb{R}^{n+1}$. To prove this we use the assumption that for the linear function $\hat f(S) = w^\top \chi(S)$ with $w \in \mathbb{R}^n$, we have
$\hat f(S) \le (f^*(S))^2 \le n\,\hat f(S)$ for all $S \subseteq [n]$. Let $u = ((n + \epsilon/2) \cdot w,\ -1) \in \mathbb{R}^m$. For any point $x$ in the support of
$D'$ we have

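$$\begin{aligned}
x = (\chi(S),\ (f^*(S))^2) &\implies u^\top x = (n+\epsilon/2)\cdot \hat f(S) - (f^*(S))^2 > 0 \\
x = (\chi(S),\ (n+\epsilon)\cdot(f^*(S))^2) &\implies u^\top x = (n+\epsilon/2)\cdot \hat f(S) - (n+\epsilon)\cdot(f^*(S))^2 < 0.
\end{aligned}$$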

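This proves the claim. Moreover, this linear function also satisfies $\hat f(S) = 0$ for every $S \in \mathcal{Z}$. In particular, $\hat f(S) = 0$
for all $S \in \mathcal{S}_0$ and moreover,

$$\hat f(\{j\}) = w_j = 0 \ \text{ for every } j \in \mathcal{U}_D, \quad\text{where } \mathcal{U}_D = \bigcup_{S_i \in \mathcal{Z}} S_i.$$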
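
The separability argument can be checked by brute force on a toy instance. The sketch below is illustrative only: the target $f^*(S) = \sqrt{|S|}$ and the weight vectors are assumptions chosen so that $(f^*(S))^2 = |S|$ is itself linear. The first call uses a $w$ that satisfies the sandwich $w^\top\chi(S) \le (f^*(S))^2 \le n\, w^\top\chi(S)$; the second uses a $w$ that violates it.

```python
import itertools
import math

def chi(S, n):
    """Indicator vector of S, as a list of n zeros and ones."""
    return [1.0 if j in S else 0.0 for j in range(n)]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def is_separated(f_star, w, n, eps):
    """Check u = ((n + eps/2) w, -1) on both points of every S with f*(S) != 0."""
    u = [(n + eps / 2) * wj for wj in w] + [-1.0]
    for r in range(n + 1):
        for S in itertools.combinations(range(n), r):
            sq = f_star(S) ** 2
            if sq == 0:
                continue                      # sets in Z are handled separately
            heads = chi(S, n) + [sq]
            tails = chi(S, n) + [(n + eps) * sq]
            if not (dot(u, heads) > 0 and dot(u, tails) < 0):
                return False
    return True

n, eps = 5, 0.5
f_star = lambda S: math.sqrt(len(S))          # toy target (an assumption)
print(is_separated(f_star, [1.0] * n, n, eps))   # w satisfies the sandwich
print(is_separated(f_star, [2.0] * n, n, eps))   # w^T chi(S) exceeds (f*)^2
```

The second call illustrates why the direction of the sandwich matters: if $w^\top\chi(S)$ is allowed to exceed $(f^*(S))^2$, the tails points are no longer on the negative side of $u$.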

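Our algorithm is as follows. It first partitions the training set $\mathcal{S} = \{(S_1, f^*(S_1)), \ldots, (S_m, f^*(S_m))\}$ into two sets
$\mathcal{S}_0$ and $\mathcal{S}_{\neq 0}$, where $\mathcal{S}_0$ is the subsequence of $\mathcal{S}$ with $f^*(S_i) = 0$, and $\mathcal{S}_{\neq 0} = \mathcal{S} \setminus \mathcal{S}_0$. For convenience, let us denote
the sequence $\mathcal{S}_{\neq 0}$ as

$$\mathcal{S}_{\neq 0} = \big((A_1, f^*(A_1)), \ldots, (A_a, f^*(A_a))\big).$$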

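Note that $a$ is a random variable and we can think of the sets $A_i$ as drawn independently from $D$, conditioned on
belonging to $\mathcal{P}$. Let

$$\mathcal{U}_0 = \bigcup_{\substack{i \le m \\ f^*(S_i) = 0}} S_i \quad\text{and}\quad \mathcal{P}_0 = \{ S : S \subseteq \mathcal{U}_0 \}.$$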

Using $\mathcal{S}_{\neq 0}$, the algorithm then constructs a sequence $\mathcal{S}'_{\neq 0} = ((x_1, y_1), \ldots, (x_a, y_a))$ of training examples for the
binary classification problem. For each $1 \le i \le a$, let $y_i$ be $+1$ or $-1$, each with probability $1/2$. If $y_i = +1$ set
$x_i = (\chi(A_i),\ (f^*(A_i))^2)$; otherwise set $x_i = (\chi(A_i),\ (n+\epsilon)\cdot(f^*(A_i))^2)$. The last step of our algorithm is to solve
a linear program in order to find a linear separator $u = (w, -z)$, where $w \in \mathbb{R}^n$ and $z \in \mathbb{R}$, consistent with the labeled
examples $(x_i, y_i)$, $1 \le i \le a$, with the additional constraints that $w_j = 0$ for $j \in \mathcal{U}_0$. The output hypothesis is
$f(S) = \left(\frac{1}{(n+\epsilon) z}\, w^\top \chi(S)\right)^{1/2}$.
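
The steps of the algorithm can be sketched end to end on a small instance. This is a minimal sketch under stated assumptions, not the paper's implementation: the linear program is replaced by a plain perceptron (a swapped-in technique that also returns a consistent separator when the data are linearly separable), the coin flips are passed in explicitly as `coins` so the run is reproducible, and all function names are hypothetical.

```python
import itertools
import math

def chi(S, n):
    """Indicator vector of S, as a list of n zeros and ones."""
    return [1.0 if j in S else 0.0 for j in range(n)]

def build_examples(samples, coins, n, eps):
    """Collect U_0 from zero-valued sets and build the labeled points."""
    u0, examples = set(), []
    coin = iter(coins)                # one coin per non-zero sample, in order
    for S, value in samples:
        if value == 0:
            u0 |= set(S)
        elif next(coin):              # heads: label +1
            examples.append((chi(S, n) + [value ** 2], 1))
        else:                         # tails: label -1
            examples.append((chi(S, n) + [(n + eps) * value ** 2], -1))
    return examples, u0

def perceptron(examples, frozen, dim, max_epochs=10000):
    """Stand-in for the LP: find u consistent with every example while
    keeping the coordinates listed in `frozen` equal to zero."""
    u = [0.0] * dim
    for _ in range(max_epochs):
        clean = True
        for x, y in examples:
            if y * sum(a * b for a, b in zip(u, x)) <= 0:
                clean = False
                for j in range(dim):
                    if j not in frozen:
                        u[j] += y * x[j]
        if clean:
            return u
    raise RuntimeError("no consistent separator found within max_epochs")

def learn(samples, coins, n, eps):
    examples, u0 = build_examples(samples, coins, n, eps)
    u = perceptron(examples, u0, n + 1)
    w, z = u[:n], -u[n]               # u = (w, -z)
    if z <= 0:
        raise ValueError("separator has z <= 0; hypothesis is undefined")
    def hypothesis(S):
        return math.sqrt(max(0.0, sum(w[j] for j in S)) / ((n + eps) * z))
    return hypothesis
```

Freezing the coordinates in $\mathcal{U}_0$ plays the role of the constraints $w_j = 0$; the clamp at zero inside `hypothesis` is an added safeguard for sets on which $w^\top\chi(S)$ happens to be negative.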

To prove correctness, note first that the linear program is feasible; this follows from our earlier discussion using
the facts (1) $\mathcal{S}'_{\neq 0}$ is a set of labeled examples drawn from $D'$ and labeled by $c^*$ and (2) $\mathcal{U}_0 \subseteq \mathcal{U}_D$. It remains to show
that $f$ approximates the target on most of the points. Let $\mathcal{Y}$ denote the set of points $S \in \mathcal{P}$ such that both of the points
$(\chi(S),\ (f^*(S))^2)$ and $(\chi(S),\ (n+\epsilon)\cdot(f^*(S))^2)$ are correctly labeled by $\mathrm{sgn}(u^\top x)$, the linear separator found by our
algorithm. It is easy to show that the function $f(S) = \left(\frac{1}{(n+\epsilon)z}\, w^\top \chi(S)\right)^{1/2}$ approximates $f^*$ to within a factor $\sqrt{n+\epsilon}$
on all the points in the set $\mathcal{Y}$. To see this notice that for any point $S \in \mathcal{Y}$, we have

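$$w^\top \chi(S) - z\,(f^*(S))^2 > 0 \quad\text{and}\quad w^\top \chi(S) - z\,(n+\epsilon)\,(f^*(S))^2 < 0$$

$$\implies\quad \frac{1}{(n+\epsilon)z}\, w^\top \chi(S) \;<\; (f^*(S))^2 \;<\; (n+\epsilon)\cdot\frac{1}{(n+\epsilon)z}\, w^\top \chi(S).$$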

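So, for any point $S \in \mathcal{Y}$, the function $f(S)^2 = \frac{1}{(n+\epsilon)z}\, w^\top \chi(S)$ approximates $(f^*(\cdot))^2$ to within a factor $n + \epsilon$.
Moreover, by design the function $f$ correctly labels as $0$ all the examples in $\mathcal{P}_0$. To finish the proof, we now note two
important facts: for our choice of $m = \frac{16n}{\epsilon}\log\left(\frac{n}{\delta\epsilon}\right)$, with high probability both $\mathcal{P} \setminus \mathcal{Y}$ and $\mathcal{Z} \setminus \mathcal{P}_0$ have small measure.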

Claim 1. With probability at least $1 - \delta$, the set $\mathcal{Z} \setminus \mathcal{P}_0$ has measure at most $\epsilon$.

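Proof. Let $\mathcal{P}_k = \{ S : S \subseteq \mathcal{U}_k \}$, where $\mathcal{U}_k$ is the union of the zero-valued sets among the first $k$ examples.
Suppose that, for some $k$, the set $\mathcal{Z} \setminus \mathcal{P}_k$ has measure at least $\epsilon$. Define
$k' = k + \log(n/\delta)/\epsilon$. Then amongst the subsequent examples $S_{k+1}, \ldots, S_{k'}$, the probability that none of them lie in
$\mathcal{Z} \setminus \mathcal{P}_k$ is at most $(1-\epsilon)^{\log(n/\delta)/\epsilon} \le \delta/n$. On the other hand, if one of them does lie in $\mathcal{Z} \setminus \mathcal{P}_k$, then $|\mathcal{U}_{k'}| > |\mathcal{U}_k|$.
But $|\mathcal{U}_k| \le n$ for all $k$, so this can happen at most $n$ times. Since $m \ge n\log(n/\delta)/\epsilon$, with probability at least $1 - \delta$ the set
$\mathcal{Z} \setminus \mathcal{P}_m$ has measure at most $\epsilon$. □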
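
The two quantities used in this proof are easy to evaluate numerically. The sketch below assumes that $\log$ denotes the natural logarithm; the function names are hypothetical.

```python
import math

def miss_probability(n, eps, delta):
    """(1 - eps)^(ln(n/delta)/eps): the chance that a block of
    ln(n/delta)/eps examples avoids a set of measure eps."""
    block = math.log(n / delta) / eps
    return (1 - eps) ** block

def examples_needed(n, eps, delta):
    """n blocks of that length, one for each possible growth of U_k."""
    return math.ceil(n * math.log(n / delta) / eps)

# Example: n = 10 items, eps = 0.1, delta = 0.05.
print(miss_probability(10, 0.1, 0.05), 0.05 / 10)
print(examples_needed(10, 0.1, 0.05))
```

The bound $(1-\epsilon)^{1/\epsilon} \le e^{-1}$ is what makes the per-block failure probability at most $\delta/n$, so a union bound over the at most $n$ blocks gives total failure probability at most $\delta$.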

We now prove: 
Claim 2. If $m = \frac{16n}{\epsilon}\log\left(\frac{n}{\delta\epsilon}\right)$, then with probability at least $1 - 2\delta$, the set $\mathcal{P} \setminus \mathcal{Y}$ has measure at most $2\epsilon$ under $D$.

Proof of Claim 2. Let $q = 1 - p = \Pr_{S \sim D}[S \in \mathcal{P}]$. If $q < \epsilon$ then the claim is immediate, since $\mathcal{P}$ has measure at
most $\epsilon$. So assume that $q \ge \epsilon$. Let $\mu = \mathrm{E}[a] = qm$. By assumption $\mu \ge 16 n \log(n/\delta\epsilon)\,\frac{q}{\epsilon}$. Then Chernoff bounds
give that

$$\Pr\left[\, a < 8 n \log(n/\delta\epsilon)\,\frac{q}{\epsilon} \,\right] < \exp\big(-n \log(n/\delta)\, q/\epsilon\big) < \delta.$$

So with probability at least $1 - \delta$, we have $a \ge 8 n \log(qn/\delta\epsilon)\,\frac{q}{\epsilon}$. By a standard sample complexity argument [33],
with probability at least $1 - \delta$, any linear separator consistent with $\mathcal{S}'$ will be inconsistent with the labels on a set
of measure at most $\epsilon/q$ under $D'$. In particular, this property holds for the linear separator $c$ computed by the linear
program. So for any set $S$, the conditional probability that either $(\chi(S),\ (f^*(S))^2)$ or $(\chi(S),\ (n+\epsilon)\cdot(f^*(S))^2)$ is
incorrectly labeled, given that $S \in \mathcal{P}$, is at most $2\epsilon/q$. Thus

$$\Pr[S \in \mathcal{P} \wedge S \notin \mathcal{Y}] = \Pr[S \in \mathcal{P}] \cdot \Pr[S \notin \mathcal{Y} \mid S \in \mathcal{P}] \le q \cdot (2\epsilon/q),$$

as required. □
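
For readers who want the Chernoff step spelled out, one way to obtain it is sketched below. It assumes the standard multiplicative lower-tail bound $\Pr[a \le (1-t)\mu] \le \exp(-t^2\mu/2)$ with $t = 1/2$, natural logarithms, $n \ge 1$, $\delta < 1$ and $\epsilon \le 1$; the constants in the original argument may have been derived differently.

```latex
\begin{align*}
\Pr\!\left[a < 8n\log\!\left(\tfrac{n}{\delta\epsilon}\right)\tfrac{q}{\epsilon}\right]
  &\le \Pr\!\left[a < \tfrac{\mu}{2}\right]
   \le \exp\!\left(-\tfrac{\mu}{8}\right)
   && \text{since } \mu \ge 16n\log\!\left(\tfrac{n}{\delta\epsilon}\right)\tfrac{q}{\epsilon} \\
  &\le \exp\!\left(-2n\log\!\left(\tfrac{n}{\delta\epsilon}\right)\tfrac{q}{\epsilon}\right)
   \le \exp\!\left(-n\log\!\left(\tfrac{n}{\delta}\right)\tfrac{q}{\epsilon}\right)
   && \text{since } \epsilon \le 1 \\
  &\le \exp\!\left(-\log\!\left(\tfrac{n}{\delta}\right)\right)
   = \tfrac{\delta}{n} \le \delta
   && \text{since } q \ge \epsilon,\ n \ge 1 .
\end{align*}
```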

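In summary, our algorithm outputs a hypothesis $f$ approximating $f^*$ to within a factor $(n+\epsilon)^{1/2}$ on $\mathcal{Y} \cup \mathcal{P}_0$. The
complement of this set is $(\mathcal{Z} \setminus \mathcal{P}_0) \cup (\mathcal{P} \setminus \mathcal{Y})$, which has measure at most $3\epsilon$, with probability at least $1 - 3\delta$.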



B Additional Results for Theorem |U 

We prove that Algorithm 2 can be used to PAC-learn (i.e. PMAC-learn with $\alpha = 1$) any unit-demand valuation.

Lemma 3. The class of unit-demand valuations is properly PAC-learnable by using $m = O(n \ln(n/\delta)/\epsilon)$ training
examples and time $\mathrm{poly}(n, 1/\epsilon, 1/\delta)$.

Proof. We first show how to solve the consistency problem in polynomial time: given a sample $(S_1, f^*(S_1)), \ldots, (S_m, f^*(S_m))$
we show how to construct in polynomial time a unit-demand function $f$ that is consistent with the sample, i.e.,
$f(S_l) = f^*(S_l)$, for $l \in \{1, 2, \ldots, m\}$. In particular, using the reasoning in Theorem 3 for $R = 1$, we show that
the unit-demand hypothesis $f$ output by Algorithm 2 is consistent with the samples. We have

$$f(S_l) = \max_{i \in S_l} f(i) = \max_{i \in S_l}\ \min_{j \,:\, i \in S_j} f^*(S_j) \le f^*(S_l).$$

Also note that for any $i \in S_l$ we have $f^*(i) \le f^*(S_l)$, for $l \in \{1, 2, \ldots, m\}$. So $f(i) \ge f^*(i)$ for any $i \in
S_1 \cup \ldots \cup S_m$. Therefore for any $l \in \{1, 2, \ldots, m\}$ we have:

$$f^*(S_l) = \max_{i \in S_l} f^*(i) \le \max_{i \in S_l} f(i) = f(S_l).$$

Thus $f^*(S_l) = f(S_l)$ for $l \in \{1, 2, \ldots, m\}$.
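
The consistency argument can be illustrated with a short sketch. Algorithm 2 itself is not reproduced in this appendix, so the code assumes, based on the max-min expression above, that each item receives the smallest sample value among the sets containing it; giving value 0 to items that appear in no sample is a further assumption, and the function name is hypothetical.

```python
def fit_unit_demand(samples, n):
    """Build a unit-demand hypothesis from (set, value) samples."""
    item_value = [None] * n
    for S, value in samples:
        for i in S:
            if item_value[i] is None or value < item_value[i]:
                item_value[i] = value          # min over sets containing i
    # Items never observed get value 0 (an assumption of this sketch).
    item_value = [0.0 if v is None else v for v in item_value]

    def hypothesis(S):
        return max((item_value[i] for i in S), default=0.0)

    return hypothesis, item_value

# Toy unit-demand target with item values 3, 1, 2, 5.
target = [3.0, 1.0, 2.0, 5.0]
f_star = lambda S: max(target[i] for i in S)
sets = [{0, 1}, {2, 3}, {1, 2}]
samples = [(S, f_star(S)) for S in sets]
f, learned = fit_unit_demand(samples, 4)
print(learned)
```

In this toy run item 1 is overestimated relative to the target, yet every sampled set is still reproduced exactly, which is what the two inequalities above establish.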

We now claim that $m = O(n \ln(n/\delta)/\epsilon)$ training examples are sufficient so that with probability at least $1 - \delta$,
the hypothesis $f$ produced has error at most $\epsilon$. In particular, notice that Algorithm 2 guarantees that $f(i) \in
\{f^*(1), \ldots, f^*(n)\}$ for all $i$. This means that for any given target function $f^*$, there are at most $n^n$ different possible
hypotheses $f$ that Algorithm 2 could generate. By the union bound, the probability that the algorithm outputs one of
error greater than $\epsilon$ is at most $n^n (1 - \epsilon)^m$, which is at most $\delta$ for our given choice of $m$. □
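
The final step can be made explicit. The short derivation below is a sketch that assumes $n \ge 1$ and $0 < \delta \le 1$, and uses $1 - \epsilon \le e^{-\epsilon}$; it shows that $m = n\ln(n/\delta)/\epsilon$ already suffices.

```latex
\begin{align*}
n^n (1-\epsilon)^m \;\le\; n^n e^{-\epsilon m} \;\le\; \delta
  \quad&\Longleftarrow\quad m \;\ge\; \frac{n \ln n + \ln(1/\delta)}{\epsilon}, \\
n \ln n + \ln(1/\delta) \;\le\; n \ln n + n \ln(1/\delta) \;&=\; n \ln(n/\delta).
\end{align*}
```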

Lemma 4. If $f^*$ is OXS with at most $R$ trees, then it is also unit-demand with at most $n^R$ leaves (with $R$-tuples as
items). For constant $R$, the family $\mathcal{F}$ of OXS functions with at most $R$ trees is PAC-learnable using $O(R n^R \log(n/\delta)/\epsilon)$
training examples and time $\mathrm{poly}(n, 1/\epsilon, 1/\delta)$.
Proof. We start by noting that since $f^*$ is an OXS function representable with at most $R$ MAX trees, $f^*$ is
uniquely determined by its values on sets of size up to $R$. Formally,

$$f^*(S) = \max_{\substack{S_R \subseteq S \\ |S_R| \le R}} f^*(S_R), \quad \forall S \subseteq \{1, \ldots, n\}. \qquad (6)$$

We construct a unit-demand $f'$, closely related to $f^*$, on meta-items corresponding to each of the $O(n^R)$ sets of at
most $R$ items. In particular, we define one meta-item $i_{S_R}$ to represent each set $S_R \subseteq \{1, \ldots, n\}$ of size at most $R$ and
let

$$f'(i_{S_R}) = f^*(S_R).$$

We define $f'$ as unit-demand over meta-items; i.e. beyond singleton sets, we have

$$f'(\{i_{S_R^1}, \ldots, i_{S_R^L}\}) = \max_{l = 1, \ldots, L} f'(i_{S_R^l}). \qquad (7)$$

By equations (6) and (7), for all $S$ we have $f^*(S) = f'(I_S)$ where $I_S = \{ i_{S_R} : S_R \subseteq S \}$.

Since we can perform the mapping from sets $S$ to their corresponding sets $I_S$ over meta-items in time $O(n^R)$, this
implies that to PAC-learn $f^*$, we can simply PAC-learn $f'$ over the $O(n^R)$ meta-items using Algorithm 2. Lemma 3
guarantees that this will PAC-learn using $O(n^R \log(n^R/\delta)/\epsilon)$ training examples and running time $\mathrm{poly}(n, 1/\epsilon, 1/\delta)$
for constant $R$. □
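
The reduction to meta-items can be checked by brute force on a small example. This sketch is illustrative only: it assumes an OXS function is evaluated as the best assignment of distinct items to distinct MAX trees, represents each tree as a dictionary from item to leaf value, and uses hypothetical function names and toy leaf values.

```python
import itertools

def oxs_value(S, trees):
    """Brute-force OXS value: best assignment of distinct items of S
    to distinct MAX trees (each tree maps item -> leaf value)."""
    items = tuple(S)

    def best(t, used):
        if t == len(trees):
            return 0
        result = best(t + 1, used)             # leave tree t unused
        for i in items:
            if i in trees[t] and i not in used:
                result = max(result, trees[t][i] + best(t + 1, used | {i}))
        return result

    return best(0, frozenset())

def meta_items(S, R):
    """I_S: one meta-item for each subset of S with at most R elements."""
    return [frozenset(c) for r in range(R + 1)
            for c in itertools.combinations(sorted(S), r)]

def unit_demand_on_meta_items(S, trees):
    """f'(I_S): maximum of f*(S_R) over the meta-items of S, R = number of trees."""
    return max(oxs_value(T, trees) for T in meta_items(S, len(trees)))

# Toy OXS function with R = 2 trees over items 0..3.
trees = [{0: 4, 1: 2, 2: 1}, {1: 3, 3: 5}]
print(oxs_value({0, 1, 3}, trees), unit_demand_on_meta_items({0, 1, 3}, trees))
```

The number of meta-items grows as $O(n^R)$, which is why the reduction is only polynomial for constant $R$.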



C Additional Result for Theorem 6

[17] proved that a certain matroid rank function $f_{R,\alpha',\beta}(\cdot)$, defined below, is hard to learn everywhere with value
queries to an approximation factor of $o(\sqrt{n}/\log n)$. We show that the rank function $f_{R,\alpha',\beta}(\cdot)$ is in OXS (all leaves in
all OXS trees will have value 1). $f_{R,\alpha',\beta}(\cdot) : 2^{\{1,\ldots,n\}} \to \mathbb{R}$ is defined as follows. Let $R \subseteq \{1, \ldots, n\}$ be a subset and
$\bar R = \{1, \ldots, n\} \setminus R$ its complement. Also fix integers $\alpha', \beta \in \mathbb{N}$. Then

$$f_{R,\alpha',\beta}(S) = \min\big(\beta + |S \cap \bar R|,\ |S|,\ \alpha'\big), \quad \forall S \subseteq \{1, \ldots, n\}. \qquad (8)$$

As a warm-up, we show that a simpler function than $f_{R,\alpha',\beta}(\cdot)$ is in OXS. This simpler function essentially
corresponds to $\beta = 0$ and will be used in the case analysis for establishing that $f_{R,\alpha',\beta}(\cdot)$ is in OXS.

Lemma 5. Let $R' \subseteq \{1, \ldots, n\}$ and $c \in \mathbb{N}$. Then the function $f_{R',c}(\cdot) : 2^{\{1,\ldots,n\}} \to \mathbb{R}$ defined as

$$f_{R',c}(S) = \min(c,\ |S \cap R'|), \quad \forall S \subseteq \{1, \ldots, n\} \qquad (9)$$

is in OXS.

Proof. For ease of notation let $f(\cdot) = f_{R',c}(\cdot)$. We assume $c < |R'|$; otherwise, $f(S) = |S \cap R'|$, $\forall S \subseteq \{1, \ldots, n\}$,
which is a linear function ($f(S) = \sum_{x \in S} f(x)$ where $f(x) = 1$ if $x \in R'$ and $f(x) = 0$ otherwise) and any linear
function belongs to the class OXS [24]. Assuming $c < |R'|$ we construct an OXS tree $T$ with $c$ MAX trees, each
with one leaf for every element in $R'$. All leaves have value 1. We refer the reader to Fig. 1(a).

Then $T(S) = f(S)$, $\forall S \subseteq \{1, \ldots, n\}$, since $f(S)$ represents the smaller of the number of elements in $S \cap R'$ (that
can each be taken from a different MAX tree in $T$) and $c$. Note that $T(S) \le c$, $\forall S \subseteq \{1, \ldots, n\}$. □

Lemma 6. The matroid rank function $f_{R,\alpha',\beta}(\cdot)$ is in the class OXS.

Proof. For ease of notation let $f(\cdot) = f_{R,\alpha',\beta}(\cdot)$. If $n \le \alpha'$ then$^{8}$ $|S| \le \alpha'$, $\forall S \subseteq \{1, \ldots, n\}$ and

$$f(S) = \min(\beta + |S \cap \bar R|,\ |S|) = |S \cap \bar R| + \min(\beta,\ |S \cap R|). \qquad (10)$$

$^{8}$ We note that $n > \alpha'$ in [17]. We consider this case for completeness.
Figure 1: OXS representations. All leaves have value 1. (a) OXS representation for the function in Eq. (9): $c$ MAX trees for $R'$, each with leaves $r_1, \ldots, r_{|R'|}$. (b) OXS representation for the function in Eq. (8) when $n > \alpha' > \beta$: $\alpha' - \beta$ MAX trees for $\bar R$ and $\beta$ MAX trees for $\{1, \ldots, n\}$.



From the proof of Lemma 5 we get that the function $f'(S) = \min(\beta, |S \cap R|)$ has an OXS tree $T'$ (i.e. $T'(S) =
f'(S)$, $\forall S \subseteq \{1, \ldots, n\}$) with $\beta$ MAX trees each with leaves only in $R$. We can create a new tree $T$ by adding
$|\bar R|$ MAX trees to $T'$, each with one leaf for every element in $\bar R$, and we get $T(S) = f(S)$, $\forall S \subseteq \{1, \ldots, n\}$. The
additional $|\bar R|$ MAX trees encode the $|S \cap \bar R|$ term in Eq. (10). If $\alpha' \le \beta$ then $f(S) = \min(\alpha', |S|)$; the claim follows
by Lemma 5 for $c = \alpha'$, $R' = \{1, \ldots, n\}$. We can thus assume that $n > \alpha' > \beta$. We prove that the OXS tree $T$,
containing the two types of MAX trees below, represents $f$, i.e. $T(S) = f(S)$, $\forall S$. We refer the reader to Fig. 1(b).

• $\alpha' - \beta$ MAX trees $T_1, \ldots, T_{\alpha'-\beta}$, each having as leaves all the elements in $\bar R$ with value 1.

• $\beta$ MAX trees $T_{\alpha'-\beta+1}, \ldots, T_{\alpha'}$, each having as leaves all the elements (in $\{1, \ldots, n\}$) with value 1.

We note that $T(S) \le \min(|S|, \alpha')$ as no set $S$ can use more than $|S|$ leaves and $T$ has exactly $\alpha'$ trees. We distinguish
the following cases:

• $|S| \le \beta$, implying $f(S) = |S|$ and $T(S) = |S|$, as $|S|$ leaves can be taken each from $|S|$ trees in $T_{\alpha'-\beta+1}, \ldots, T_{\alpha'}$.

• $f(S) = \alpha' \le \min(\beta + |S \cap \bar R|, |S|)$. We claim $T(S) \ge \alpha'$. There must exist $\alpha' - \beta$ elements in $S \cap \bar R$ that
we can select one from each tree $T_1, \ldots, T_{\alpha'-\beta}$. Also $|S| \ge \alpha'$ and we can take the remaining $\beta$ elements from
$T_{\alpha'-\beta+1}, \ldots, T_{\alpha'}$.

• $f(S) = |S| \le \min(\beta + |S \cap \bar R|, \alpha')$. This implies $|S \cap R| \le \beta$ and $|S| \le \alpha'$. We claim $T(S) \ge |S|$: we
can take all needed elements in $S \cap \bar R$ from $T_1, \ldots, T_{\alpha'-\beta}$ (and from $T_{\alpha'-\beta+1}, \ldots, T_{\alpha'}$ if $|S \cap \bar R| > \alpha' - \beta$) and
elements in $S \cap R$ from $T_{\alpha'-\beta+1}, \ldots, T_{\alpha'}$.

• $f(S) = \beta + |S \cap \bar R| \le \min(|S|, \alpha')$. We claim $T(S) \ge \beta + |S \cap \bar R|$: we can take $\beta \le |S \cap R|$ elements from
$T_{\alpha'-\beta+1}, \ldots, T_{\alpha'}$ and $|S \cap \bar R| \le \alpha' - \beta$ elements from $T_1, \ldots, T_{\alpha'-\beta}$. Finally, $T(S) \le \beta + |S \cap \bar R|$ since at most
all elements in $S \cap \bar R$ can be taken from $T_1, \ldots, T_{\alpha'-\beta}$ and at most $\beta$ elements in $S \cap R$ from $T_{\alpha'-\beta+1}, \ldots, T_{\alpha'}$.



□ 
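
Both tree constructions (Lemma 5 and Lemma 6) can be verified exhaustively for small parameters. The sketch below is an illustration under stated assumptions: since every leaf has value 1, $T(S)$ is computed as the size of a maximum matching between the items of $S$ and the MAX trees, using a standard augmenting-path search; items are indexed from 0 and the function names are hypothetical.

```python
import itertools

def tree_value(S, trees):
    """T(S) when every leaf has value 1: size of a maximum matching between
    the items of S and the MAX trees (each tree is the set of its leaf items)."""
    owner = {}                                   # tree index -> matched item

    def try_assign(item, visited):
        for t, leaves in enumerate(trees):
            if item in leaves and t not in visited:
                visited.add(t)
                # Take a free tree, or re-route the item currently holding it.
                if t not in owner or try_assign(owner[t], visited):
                    owner[t] = item
                    return True
        return False

    return sum(try_assign(item, set()) for item in S)

def lemma5_trees(r_prime, c):
    """c MAX trees, each with one leaf per element of R'."""
    return [set(r_prime) for _ in range(c)]

def lemma6_trees(n, r, alpha, beta):
    """alpha - beta trees over the complement of R, then beta trees over all items."""
    r_bar = set(range(n)) - set(r)
    return ([set(r_bar) for _ in range(alpha - beta)]
            + [set(range(n)) for _ in range(beta)])

n, r, alpha, beta = 6, {0, 1, 2, 3}, 4, 2        # toy parameters with n > alpha > beta
r_bar = set(range(n)) - r
trees6 = lemma6_trees(n, r, alpha, beta)
print(tree_value(range(n), trees6))
```

A brute-force comparison against Eq. (8) and Eq. (9) over all $2^n$ subsets is cheap at this size and exercises each of the four cases in the proof.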